PREFACE

This  technical  report  covers  the  work  performed  under  Contract  No. 
[...] previous phases, was to provide the means of achieving improved performance,
modularity  and  flexibility  in  the  design  of  next  generation  microcomputer-based 
missile  guidance  and  control  systems. 

LCDR W. Savage, Office of Naval Research, Arlington, VA, was the Navy
Scientific Officer.

Mr. F.J. Langley was the Principal Investigator for Raytheon; Mr. J.
Demetrick was the Hardware Design Engineer and Mr. P.S. Marchilena the Software
Design Engineer.

Publication  of  this  report  does  not  constitute  Navy  approval  of  the 
report's  findings  or  conclusions.  It  is  published  only  for  the  exchange  and 
stimulation  of  ideas. 

1. INTRODUCTION

This report presents the results of the final phase of the Navy Modular Digital
Missile Guidance study, which culminates in the design of a high-speed "super-federated"
microcomputer system. The latter obviates the need to resort to multicomponent,
bit-slice computer architectures to meet the higher throughput processing requirements of
missile guidance and control and, more importantly, super-federation supports modular
software by assigning one microcomputer circuit to each major algorithm.

1.1  Background 

In the previous study phases (References R-1 through R-18), programmable digital
techniques were shown to offer better performance and greater flexibility than the
traditional hardwired analog implementations of seeker head control, signal processing,
estimation, guidance, autopilot, warhead fuzing, telemetry and test functions.

To achieve modularity and growth in hardware and software, a top-down system
study approach was adopted. The entire range of air-to-air missiles was first divided into
three distinguishable generic classes, with upper and lower performance boundaries
within each class (Figure 1-1). The major functions and data rates amenable to digital
processing were then defined, their constituent software modules determined, and these
modules sized in terms of computer throughput and memory requirements (References R-1,
R-2, R-3 and R-6).

Such a modular breakdown of on-board missile guidance and control functions,
together with their associated interfaces, provided the option of configuring and
evaluating either single or multiple federated/distributed computer system
implementations according to the design constraints of a given missile.

Simulation analyses were also performed to confirm computer requirements and
relate algorithm complexity to performance improvements for the guidance, estimation
and autopilot/control functions (Reference R-3).

With the computer design requirements determined from these studies, a set of
microcomputer "macromodules" was defined to support the entire range of air-to-air
missile functions, in either single or multiple/federated microcomputer system
configurations (References R-3 through R-7). The modules were simulated individually and
collectively, as whole microcomputer configurations (References R-8 and R-12), in order to
validate their effectiveness in missile guidance and control applications to the point
where realistic product function specifications could be prepared (Reference R-14). To
broaden the effectiveness of these specifications with respect to current technology
developments, the compatibility of the Navy AN/UYK-30 microprocessor for
general-purpose processing, charge-coupled device (CCD) technology for signal processing
(Reference R-17), and fiber-optic data link device technology for the serial digital system
bus was explored (Reference R-14) through NOSC support.

A similar study was then performed under NSWC sponsorship to address the
ship-to-air and ship-to-ship missile requirements, which led to the fabrication of selected
microcomputer macromodules and their application to a Class I missile guidance and control
system (Reference R-19).

To accommodate improvements in microcomputer circuit technology without
major redesign and provide flexible memory-mapping of input-output and main memory
storage space, a programmable Microbus Interface Module (MIM) was designed and
incorporated in each macromodule (References R-16 and R-18).

While standard industry, single-chip microprocessors could be used for certain missile
functions, their throughput was insufficient for high-performance (Class III) applications
such as target seeker and autopilot processing. To avoid the design and fabrication
of a high-speed, general-purpose processor in Schottky-bipolar or CMOS/SOS circuit
technology, with its attendant multicomponent logistics and support software problems, a
"super-federated" multimicroprocessor architecture was proposed for investigation and
design during this phase of the program (References R-20 and R-21). The goals of this
study are described in the following subsection.

1.2  Objectives  and  Scope 

The  objectives  and  scope  of  the  Phase  VI  study  under  the  modification  of  Contract 
N00014-75-C-0549  are  as  follows. 

The  contractor  shall  continue  the  Digital  Missile  Guidance  study  by  performing 
fundamental  hardware  and  software  analysis  to  determine  the  feasibility  of  utilizing  one 
common  microcomputer  type  throughout  the  modular  digital  missile  concept.  In  this 
Phase  VI,  the  following  tasks  shall  be  performed: 

1) Task 1 - Microstructural Analysis - Analyze the high throughput requirement
functional groups defined in the previous phases to determine the
practicality of decomposing these into microstructures that can utilize
one common, single-chip microcomputer as the basic computing cell.
Investigate the feasibility of software compatibility throughout the
functional groups, emphasizing replacing software program linkages
with hardware interfaces.

2) Task 2 - Microstructure Simulation and Evaluation - Perform a digital
simulation of the microstructures defined in Task 1 to prove the
intermicrocomputer timing and interface and verify that the
throughput of each microcomputer group meets the requirements for
the complex, Class III missile defined in previous work.

1.3  Publications  and  Presentations 

Throughout  the  Modular  Digital  Missile  Guidance  Program  the  results  of  each  phase 
have  been  widely  published  and  presented  to  various  Government  Agencies  and  Industry. 
The  work  has  proven  timely  not  only  in  the  field  of  missile  guidance  and  control  but  in  the 
design  of  any  microcomputer-based  system. 

Twenty-two papers and reports have been published. The papers were presented at
various conferences sponsored by the IEEE Computer Society, AFIPS/NCC, NASA/JPL,
AIAA, DDR&E/IDA, SAE, SPIE, DPMA and NATO/AGARD.

Requests for these papers were received from several overseas institutions, viz:
Swedish National Defense Research Institute; Center for Applied Research in Electronics,
India; Institute of Nuclear Research, Poland; Central Research Laboratory, Mitsubishi
Electric Corp., Japan; Rijks University, Holland; The Weizmann Institute of Science,
Israel; Institute of Radio and Electronics, Czechoslovakia; Centre National De La
Recherche Scientifique, France.

Presentations and briefings were made to several branches of the Navy, Air Force,
Army, NASA and allied groups, viz: NAVAIR, NOSC, NWC, NPGS, NSWC, SSPO, NAAFI,
AFATL, MICOM, ABMDA, NASA/JPL, MIT/Draper Lab.

2. SUMMARY AND CONCLUSIONS

Two  serious  problems  exist  in  military  computer-based  systems: 

1) The seemingly exorbitant cost of operational software, and
2) The absence of a standard microprocessor-on-a-chip for all computing
applications.

The first problem has been linked to the size, complexity, and testability of
computer programs as well as the level of the programming language and, more importantly,
the fundamental structure and modularity of the program assigned to a computer
(Reference R-3).

The second problem is purely a technology advancement issue, and the recent
advent of commercial 16-bit microprocessor chips with speeds of the order of 500,000
instructions per second (References R-22 and R-23) presents an opportunity to solve both
of the above imperfections in the state of the art.

Hence, the main thrust of this phase in the Modular Digital Missile Guidance
Program has been to exploit the low cost of the 16-bit microprocessor and microcomputer to
solve the problems of high software cost and computer commonality.

The basic concept uses what has been termed "super-federated" computer system
design techniques (References R-20 and R-21), where each major function/algorithm is
assigned to a separate microcomputer. A software change is then identified with a
specific integrated circuit which, in turn, is part of an overall modular structure. As such,
the super-federated design approach reduces program size and complexity, supports
software modularity, and uses one microprocessor type. The latter, however, must be
configured in a suitable multiprocessor architecture which achieves high throughput in
cases where the speed of a single microprocessor chip is inadequate.

This study addresses two major design goals, software modularity and high
throughput, while avoiding the pitfalls (both hardware and software) of earlier
multiprocessor designs. The results of the study, which are manifested in the super-federated
microcomputer design drawings given at the end of this report, are summarized in
Subsection 2.1.

Super-federation of individual functions among separate microcomputers in a
practical multiprocessor architecture provides the following advantages and/or solutions to
computer system software and hardware design problems:

1)  Software  Modularity  -  By  assigning  one  major  algorithm/program 
module  per  microcomputer,  a  software  change  can  be  identified  with 
and  confined  to  a  replaceable  integrated  circuit. 

2)  Software  Control  Structure  -  The  interface  between  major  program 
modules  is  a  fixed  hardware  interface  which  facilitates  software 
changes  and  bounds  the  domain  of  each  module.  Further,  the  control 
hierarchy  is  implicit  in  the  multiprocessor  hardware  structure. 

3) Programming Simplicity - Single computer programming simplicity
has been achieved within a common memory map. Each microprocessor
is programmed as an entity with the base page of the memory
space dedicated to the operational program.

4) High Throughput - The high throughput functions in some missiles
(guidance and autopilot, for example) can be performed with several
medium performance microcomputers of the same type by exploiting
parallelism and overlap in the execution of their constituent program
modules.

5) Standard Microcomputer - Super-federation allows the use of one
microcomputer type throughout the missile guidance and control
system, singly in such cases as warhead fuzing or in a multiprocessor
configuration for the more complex high-speed functions.

6)  Common  Support  Software  -  The  use  of  one  microcomputer  type  for 
both  high  and  low  throughput  requirements  obviates  the  need  to 
design  and  build  a  high  speed  processor  with  a  unique  instruction  set 
and  its  attendant  support  software  development  cost  and  risk. 

3. DIGITAL MISSILE GUIDANCE AND CONTROL

Despite the many functional advantages of digital versus analog systems, the
simple substitution of a small general-purpose computer in place of the former analog
circuits does not in itself solve all the problems encountered in the life cycle of a missile.
The hardcore problems of advancing technology, changing threat situations, systems
integration and logistics, together with the ever increasing cost of software, can result in an
excessive premium paid for digital missiles.

While throughput could be satisfied with a single, high performance, miniclass
computer and a dedicated, special-purpose target sensor signal processor, form-factor and
electrical interface problems arise due to the many analog and digital discrete signals
being converted and processed at a central point as opposed to being handled at the
source. In addition, the design, assembly and checkout of major missile
sections/functions (e.g., seeker, warhead, flight control, telemetry) as completely operational
modules are not possible with a single computer design approach.

From a software viewpoint, programming complexity increases with program size
and the time multiplexing of individual missile functions to meet the sampling and
computational delay requirements of the various control loops. The modification or
updating of any given function within the total program is then fraught with virtually
unknown and complex software interface problems, a situation which worsens as the level
of coding diminishes. In other words, the interface problems cited for analog systems can
reappear in digital missiles at the computer input-output interface and, in a more devious
manner, within the invisible internal software. Although software modularity supports the
system flexibility requirement for changing threat/mission situations, it has nevertheless
proven difficult to achieve and maintain through a development cycle.

3.1  Motivations  for  Federated  and  Super-Federated  Systems 


The motivations for designing and building federated systems stem from the
shortcomings of single computer systems cited in the previous paragraphs and the
availability of low-cost, large-scale integrated (LSI) circuit microcomputers and associated
I/O support circuits.

3.1.1 Hardware


Federated microcomputer systems simplify subsystem design, manufacture,
interface, test and the inevitable modifications and updates. Figure 3-1 serves to
illustrate the major differences between single and federated computer design
approaches. In the case of the single computer system, a relatively large high
performance minitype computer is subject to the varying form factor constraints of a missile.
To move the computer to a different location invariably entails the repackaging of hardware
to fit the space available. A multiwire analog and digital interface problem also results
from the concentration of data processing and conversion in one place.

In contrast, the federated microcomputer system performs the data conversion
and processing tasks at source, within each major subsystem, and allows a standard
serial digital multiplex interface to be used between subsystems and the launcher. This
partitioning is discussed at greater length in subsequent sections of this report.

Figure 3-1 - Single versus Federated Missile Guidance and Control Systems

3.1.2 Software

The merits of modular structured software, although well appreciated in
this day and age, are somewhat idealistic and difficult to achieve and maintain.
Figure 3-2 illustrates a rational, modular, hierarchical control structure for a single computer
missile guidance and control system. All calls are made downward from the executive to
subordinate mode supervisors and supporting functional program modules. However,
under the normal pressure of tight development schedules the finished software is subject
to shortcuts which invariably violate the original clean modular lines of the control
structure. The outcome of a degradation in software modularity is realized more in the later
phases of the development process when changes and substitutions are required
(Figure 3-3). Since software costs are pegged to ever-increasing labor rates, the impact of the
deficiencies in the design structure, together with other significant factors outlined
below, is not apparent until the total cost of the finished product is paid.


Figure  3-2  -  Modular  Hierarchical  Software  Control  Structure  for 
Single  Computer  Missile  Guidance  and  Control  System 

Figure 3-3 - Single Computer System Software Is
Inaccessible to Subsystem Designers

Cost Per Instruction =
(Design Cost + Coding Cost + Verification Cost + Maintenance Cost) / (No. Lines of Code)

Determining  factors: 

1)  Predominantly  labor  costs,  dependent  upon: 

a)  Firmness  of  Requirements 

b)  Proportion  New  versus  Proven  Algorithms 

c)  Size 

d)  Complexity 

2) Number of lines of code, dependent upon:


a)  Number  Functions  Assigned  to  Software 

b)  Level  of  Programming  Language 

Software costs experienced over the past few years indicate an average cost
of $40 to $60 per instruction for a fully commissioned system in terms of new, real-time,
operational programs, and between $8 and $30 per instruction for more standard routines.
Whereas the cost of semiconductor memory is estimated to be in the order of millicents
per bit, a 50-word subroutine typically costs $3000 as a finished product. Microcomputer
hardware, on the other hand, enjoys a volume market with modules selling in the tens of
dollars. This situation emphasizes the need to be able to reuse or recycle program
modules and to curb the tendency of designers to "do it in software" when in doubt about the
requirements of a specific system function.
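
The cost relationship above can be checked with a short calculation. The following is a minimal sketch, not part of the original study: it assumes one instruction per word, a 16-bit word, and a memory cost of exactly one millicent per bit, and the function names are illustrative only.

```python
def cost_per_instruction(design, coding, verification, maintenance, lines_of_code):
    """Total cost of the finished program divided by its number of lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return (design + coding + verification + maintenance) / lines_of_code


def software_cost(instructions, dollars_per_instruction):
    """Finished-product cost of a routine at a given cost per instruction."""
    return instructions * dollars_per_instruction


def memory_cost(words, bits_per_word=16, dollars_per_bit=0.00001):
    """Cost of the memory holding the routine.

    Assumptions (not from the report): 16-bit words and exactly one
    millicent ($0.00001) per bit.
    """
    return words * bits_per_word * dollars_per_bit


# A 50-word subroutine at the upper figure of $60 per instruction.
subroutine_dollars = software_cost(50, 60)
# The memory that stores the same 50 words, under the assumptions above.
storage_dollars = memory_cost(50)
```

Under these assumptions the 50-word subroutine costs $3000 to produce while the memory that stores it costs less than one cent, which is the imbalance the paragraph describes.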

3.1.3  Throughput 

Studies have shown that the throughput requirements for single computer
systems can reach the two million operations per second (2 MOPS) mark (Figure 3-4).
While a machine could be designed and built to meet the speed requirement, the tendency
has been to add more functions, during the initial system development cycle and later,
throughout the life span of the missile. Since there is a finite limit to the speed of the
original computer, the increased load must be accommodated either by redesigning the
machine or by outrigging satellite processors to absorb the overflow which, in turn, tends
toward a distributed system of haphazard design.

Figure 3-4 - Single Computer System General-Purpose Throughput Requirements

3.2 Defining/Identifying System Structures


Before  embarking  upon  the  design  of  any  federated  system,  it  is  important  to 
consider  the  whole  system  as  opposed  to  applying  federated  techniques  on  a  piecemeal 
basis.  The  reason  for  this  is  to  identify  the  characteristic  structure  of  the  system  in 
terms  of  its  constituent  functions,  data  flow  and  data  rates.  Figure  3-5  shows  the  major 
functions  of  a  typical  missile  system  and  their  relationship  to  one  another. 

Figure 3-5 - Typical Missile Guidance and Control System -
Functional Block Diagram

In the system shown, the target sensor is mounted on a gimballed platform
stabilized against missile body motion by platform-mounted rate gyros and torquers in
conjunction with the seeker head control electronics. Target sensor outputs (e.g., radar,
infrared (IR) or electro-optical (EO)) are processed by the signal processor, which provides
target range and angle data (angle only for IR and EO sensors) for subsequent filtering and
processing into boresight error and 'g' commands using appropriate estimation and guidance
law algorithms. The latter "steering" data controls the seeker platform and autopilot for
target intercept. The autopilot also stabilizes the airframe against body motion and
bending effects, using body gyros and accelerometers as data sources and outputting fin
deflection commands to fin control actuators. Detonation of the warhead is determined
by the detection of the target by the warhead's target detection device, augmented with
end game geometry data from the primary target sensor signal processor and estimator.
The form of motor control can vary from a simple squibbing signal from the weapon
control system, via the umbilical, to sophisticated fuel control based on temperature,
pressure and aerodynamic data in the case of ramjet propulsion units.

The degree of interaction between the functional components of the system, their
physical relationship, system modularity requirements, and the magnitude of the
processing task in each case influence the structure of the practical distributed microcomputer
system.

3.2.1  System  Timing  Considerations 

The basic or characteristic structure of a system, as far as its
implementation with distributed microcomputers is concerned, is determined by the system timing
constraints and the autonomy of functions. Figure 3-6 shows the system of the previous
figure with switches interposed between the major functional blocks and the associated
sampling or update rates indicated to satisfy the Nyquist criterion. Three major control
loops are visible: seeker head, autopilot and steering command. The first two
require relatively high sampling rates (125-500 Hz) to meet the bandwidths
involved, whereas the steering command update rate is quite low (10-20 Hz). Also, the
two high speed loops are virtually autonomous with their respective sensors and
torquers/actuators.
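
To give a feel for what these rates imply, the sketch below converts each loop rate into a sample period and into the number of instructions a single microcomputer could execute per sample. It is illustrative only: it assumes the 500,000 instructions per second figure quoted in Section 2 for commercial 16-bit microprocessors, ignores I/O and scheduling overhead, and uses hypothetical names throughout.

```python
# Loop update rates (low, high) in Hz, taken from the text above.
LOOP_RATES_HZ = {
    "seeker_head": (125, 500),
    "autopilot": (125, 500),
    "steering_command": (10, 20),
}

# Assumed single-chip speed, from the figure quoted in Section 2.
MICRO_INSTRUCTIONS_PER_SECOND = 500_000


def sample_period_ms(rate_hz):
    """Time between successive samples, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


def instructions_per_sample(instructions_per_second, rate_hz):
    """Whole instructions one processor can execute between samples."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return instructions_per_second // rate_hz


def loop_budgets(rates=LOOP_RATES_HZ, ips=MICRO_INSTRUCTIONS_PER_SECOND):
    """Map each loop to its instruction budget at the low and high rate."""
    return {
        name: (instructions_per_sample(ips, low), instructions_per_sample(ips, high))
        for name, (low, high) in rates.items()
    }
```

On these assumptions a 500 Hz loop leaves roughly a thousand instructions per sample on one chip, while the 10 Hz steering update leaves tens of thousands, which is consistent with treating the fast loops as the throughput-critical ones.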

Figure 3-6 - Digital System Timing Considerations

3.2.2 System Parallelism


Figure 3-6 views the system as a set of functional blocks, but if the system
is redrawn to reflect the planar control channels (pitch and yaw for the seeker
gimballed platform; pitch, roll and yaw for the autopilot, branching out into four fin
control channels), then parallelism becomes evident (Figure 3-7). This system
characteristic offers potential for using several low throughput microcomputers as opposed to a
few high throughput machines.

3.2.3  Macro-Structure  System 


Based upon the system as it appears in Figure 3-6, the obvious
macrostructure which exploits subsystem autonomy and low intersubsystem data rates is as
shown in Figure 3-8. One microcomputer is assigned to each major subsystem and a
common input-output (I/O) interface interconnects subsystems via a system bus at the low
10 Hz update rate. In terms of control hierarchy, the target seeker microcomputer
controls the system bus since all other subsystems are subordinate "users" of the seeker data
(Figure 3-9). This form of distributed microcomputer system is a true federated system,
since each microcomputer operates virtually autonomously. Further, it meets the
subsystem modularity design goal whether subsystems are colocated physically or not.
However, there is one major drawback to this level of partitioning, as shown in Figure 3-10.

Throughput requirements for the individual microcomputers vary widely, from
up to one MOPS to as low as 50 KOPS. As a result, the high throughput requirements
of the seeker, flight control and head control functions indicate a bit-slice machine in
Schottky-bipolar or complementary metal-oxide semiconductor, silicon-on-sapphire
(CMOS-SOS) device technology, while the remaining low throughput functions indicate a
single-chip microcomputer. The cost of designing and building the bit-slice µCs and necessary
support software is something to be avoided if possible; hence the need arises to explore
alternative implementations using one type of microcomputer-on-a-chip throughout
the system.
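
A first-order estimate of how many identical single-chip microcomputers each function would need can be sketched as follows. This is a rough sizing aid under stated assumptions, not a result of the study: it takes the chip speed to be the 500,000 instructions per second quoted in Section 2, treats one operation as one instruction, and assumes the work divides perfectly with no interprocessor overhead.

```python
# Assumed speed of the common single-chip microcomputer (Section 2 figure).
CHIP_OPS_PER_SECOND = 500_000


def microcomputers_needed(required_ops_per_second, chip_ops_per_second=CHIP_OPS_PER_SECOND):
    """Smallest whole number of identical chips whose combined speed meets the load."""
    if required_ops_per_second < 0 or chip_ops_per_second <= 0:
        raise ValueError("loads must be non-negative and chip speed positive")
    # Ceiling division on integers; at least one chip is always needed.
    return max(1, -(-required_ops_per_second // chip_ops_per_second))


# Loads quoted in the text: the 1 MOPS and 50 KOPS extremes of the
# macro-structured system, and the 2 MOPS single-computer figure of 3.1.3.
high_function_chips = microcomputers_needed(1_000_000)
low_function_chips = microcomputers_needed(50_000)
single_computer_equivalent = microcomputers_needed(2_000_000)
```

In practice the count would be higher wherever the function cannot be split evenly, so these numbers are best read as lower bounds.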

Figure 3-8 - Macro-Structure Partitioning

Figure 3-9 - Macro-Structure Control Hierarchy

Figure 3-10 - Microcomputer Throughput Requirements for Macro-Structured System

3.2.4 Super-Federated Systems


The high throughput macro functions identified in the previous paragraphs have the
potential of being broken down into "microstructures" exploiting the intrinsic
parallelism and overlap timing characteristics of the system. Figure 3-11 illustrates the
use of separate single-chip microcomputers for each subfunction within the major
functions of target seeker and autopilot.

In the case of the target seeker processing group, a "heel-to-toe" computing sequence
is evident since each microcomputer is waiting for the output of a preceding subfunction.
However, certain preliminary operations can proceed while waiting for real-time update
information, e.g., state estimation. Further, since the spectrum analysis subfunction
is a fixed entity, i.e., either a 64, 128 or 256-point fast Fourier transform (FFT) process,
it should be executed in a high-speed special-purpose processor to allow more
time in the overall budget of 20 or so milliseconds for the slower general-purpose µCs
to execute their respective tasks. In other words, software is used where flexibility
is required.
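
The effect of moving the spectrum analysis into a special-purpose processor can be illustrated with a simple latency budget for the heel-to-toe chain. In the sketch below the 20 ms budget and the 300 µs figure for a 64-point FFT in the bit-slice module come from this report; every other stage name and stage time is a hypothetical placeholder chosen only to show the arithmetic.

```python
FRAME_BUDGET_US = 20_000  # overall budget of "20 or so" milliseconds

# Hypothetical stage times in microseconds for an all-software chain.
ALL_SOFTWARE_US = {
    "spectrum_analysis": 12_000,  # placeholder for an FFT done in a general-purpose chip
    "target_detection": 3_000,    # placeholder
    "state_estimation": 4_000,    # placeholder
    "guidance_law": 3_000,        # placeholder
}

# Same chain with the FFT moved to the special-purpose module
# (about 300 microseconds for a 64-point complex FFT, per Section 3.3.2).
WITH_SPECIAL_PROCESSOR_US = dict(ALL_SOFTWARE_US, spectrum_analysis=300)


def chain_latency_us(stage_times_us):
    """Heel-to-toe latency: each stage waits for the one before it."""
    return sum(stage_times_us.values())


def slack_us(stage_times_us, budget_us=FRAME_BUDGET_US):
    """Time left in the frame; a negative value means the budget is exceeded."""
    return budget_us - chain_latency_us(stage_times_us)
```

With these placeholder figures the all-software chain overruns the frame, while the version with the special-purpose processor leaves several milliseconds for the general-purpose stages. Overlapping preliminary operations, as noted above, would improve the margin further.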

Figure 3-11 - Super-Federated Subsystem Processing, Target Seeker and Autopilot

The autopilot case is quite different. As was noted earlier (Figure 3-7),
three-axis control can be exploited through parallel processing, thereby allowing several
relatively low speed µCs to be used to perform a high-speed composite function.

Software modularity is enhanced in each of the above cases, since the functional
program modules shown in Figure 3-2 are now visible as separate single-chip
microcomputers. Taken to an extreme, a 1:1 correlation between the program modules of
Figure 3-2 and µCs would ensure software modularity and provide a fixed hardware interface
between software routines. Subroutine calls would then be handled by hardware linkages
between µCs. The situation depicted in Figure 3-3 could conceivably be transformed
into the more desirable state of affairs shown in Figure 3-12, where a subfunction change
is performed by the simple replacement of a single-chip microcomputer with the
correctly programmed alternative.
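
The idea of a software change by hardware substitution can be modeled in a few lines. The sketch below is a conceptual illustration only and does not represent the report's hardware design: each "socket" stands for a fixed hardware interface, each chip carries exactly one program module, and all class, socket and part names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Microcomputer:
    """One single-chip microcomputer holding exactly one program module."""
    part_label: str
    program: Callable[[float], float]


class SuperFederatedChain:
    """Fixed chain of sockets; the socket order plays the role of the hardware linkage."""

    def __init__(self, sockets: List[str]):
        self._order = list(sockets)
        self._chips: Dict[str, Microcomputer] = {}

    def install(self, socket: str, chip: Microcomputer) -> None:
        """Plug a chip into a socket, replacing whatever was there."""
        if socket not in self._order:
            raise KeyError(f"no such socket: {socket}")
        self._chips[socket] = chip

    def run(self, sample: float) -> float:
        """Pass a sample through every socket in order."""
        value = sample
        for socket in self._order:
            value = self._chips[socket].program(value)
        return value


chain = SuperFederatedChain(["estimator", "guidance"])
chain.install("estimator", Microcomputer("EST-A", lambda x: x))
chain.install("guidance", Microcomputer("GDN-A", lambda x: 3.0 * x))
before = chain.run(2.0)

# A guidance-law change is made by swapping one chip; nothing else is touched.
chain.install("guidance", Microcomputer("GDN-B", lambda x: 4.0 * x))
after = chain.run(2.0)
```

Because the interface between sockets is fixed, the replacement chip cannot disturb the estimator module, which is the bounding property claimed for the super-federated approach.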

Figure 3-12 - Software Change by Hardware Substitution
in Super-Federated Systems


3.3  Microcomputer  Modularity 

To cover the range of missile throughput requirements, a set of microcomputer
macromodules was defined (Table 3-1). Memory-mapped I/O is used to eliminate Direct
Memory Access (DMA) to "main" memory and the associated control circuits. Figure 3-13
shows the grouping together of modules to form a federated missile guidance and control
system.

The crux of modularity at the microcomputer level was the definition of a standard
microbus (Reference R-14) oriented toward standard industry semiconductor memory
circuit interfaces, i.e., read-write/random-access memories (RAMs) for data storage and
read-only memories (ROMs) for programs.

Figure 3-13 - Modular Federated Microcomputer Missile
Guidance and Control System

3.3.1 Programmable Microbus Interface Module

In the microcomputer industry no two microbus interface schemes are the
same, e.g., S-100 Bus, Intel MULTIBUS, National Microbus, etc., and similarly, the
electrical interfaces of available support modules vary, e.g., analog-to-digital (A-D) and
digital-to-analog (D-A) converters, memory modules, and serial digital interface modules.

Through the definition of an independent microbus interface, a programmable
microbus interface module (MIM) (References R-16 and R-18) was designed. This
module employs high-speed field programmable logic arrays (FPLAs) and programmable
read-only memories (PROMs) to interface standard-industry microcomputer components,
i.e., microprocessors, RAMs, ROMs, multiplexer A-D converters, D-A converters and
serial digital I/O modules, with the microbus. Further, each individual component can be
replaced with a more desirable product from a different manufacturer, at any time during
the life cycle of the system, by reprogramming the MIM to accommodate the interface
peculiarities of the new product.

3.3.2  Frequency  Spectrum  Analyzer  Module 

Missile radar target seeker signal processing requirements are low compared
to avionic and ground-based air defense systems (Reference R-3) (Figure 3-14).
Nevertheless, the frequency spectrum analyzer (FSA) module of Table 3-1, using bit-slice
microprocessor circuits, requires approximately 150 LSI/MSI/SSI circuits, dissipates
approximately 50 W using Schottky-bipolar circuit technology, and executes a 64-point
complex FFT in approximately 300 µsec, meeting only Class I and II missile performance
requirements (References R-6, R-14 and R-17). Such a processor dwarfs the single-chip
microcomputer (Figure 3-15). In contrast, a charge-coupled device (CCD) processor using
the chirp-Z transform (CZT) and transversal filters executes an equivalent 64-point
analysis in approximately 13 µsec, with a power dissipation of less than 5 W, meeting all
three missile class requirements (References R-17, R-24 and R-25). While dark current is
a limiting factor in the dynamic range of analog CZT processors at the upper end of the
MIL temperature range, recent improvements in prototype surface channel CCDs at
Raytheon and elsewhere (Reference R-26) indicate that this performance deficiency is
temporary. Further, based on recent NASA/TI work, a 2-chip CCD CZT processor
appears feasible in the near future.
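
As a rough worked comparison of the two sets of figures quoted above, the sketch below multiplies power by execution time to estimate the energy spent per 64-point analysis. It assumes the quoted dissipation applies for the full duration of the transform and treats the CCD figure of "less than 5 W" as an upper bound of 5 W; the function and variable names are illustrative only.

```python
def energy_per_transform_uj(power_w: float, time_us: float) -> float:
    """Energy in microjoules: watts multiplied by microseconds."""
    return power_w * time_us


# Bit-slice FFT module: approximately 50 W and 300 usec (figures from the text).
bit_slice_uj = energy_per_transform_uj(50.0, 300.0)
# CCD chirp-Z processor: 13 usec, with 5 W taken as an upper bound on power.
ccd_czt_uj = energy_per_transform_uj(5.0, 13.0)

speed_ratio = 300.0 / 13.0
energy_ratio = bit_slice_uj / ccd_czt_uj

print(f"bit-slice FFT: {bit_slice_uj:.0f} uJ per transform")
print(f"CCD CZT:       at most {ccd_czt_uj:.0f} uJ per transform")
print(f"speed ratio {speed_ratio:.1f}, energy ratio at least {energy_ratio:.0f}")
```

Under these assumptions the CCD approach is roughly twenty times faster and uses on the order of two hundred times less energy per analysis, which is the basis for the contrast drawn in the text.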
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Figure  3-14  -  Radar  Signal  Processing  Throughput  Requirements 
3.3.3  Serial-Digital  Input-Output  (SDIO)  Module 


The SDIO module provides a MIL-STD-1553B-compatible serial digital multiplex
bus interface between microcomputers in the missile and the external weapon control
system (Reference R-27). Conventional transformer coupling to the
transmission line requires relatively large, high-current line driver, receiver and
transformer components which, in turn, are inconsistent with today's single-chip
microcomputers and the small size, weight and power limitations of a missile. Fiber-optic
coupling between subsystem microcomputers, using simple LED/PIN diode/T²L interface
components (Reference R-28) and single-chip Manchester II/NRZ code converters (Reference
R-29), reduces the serial I/O interface hardware to more realistic proportions (Reference
R-14). However, the single party-line bus is not currently amenable to fiber-optic
technology, since T-couplers introduce a 3 dB loss at each drop point. A simple
alternative is the ring system of Figure 3-15, using a round-robin protocol (Reference R-30). A
more complex multiline approach is the star configuration, which would be suitable for a
simple, single-mode, short-range missile where the seeker becomes the focal point.
Eight-port couplers have been built under Air Force contracts (Reference R-31).

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Figure 3-15 - Fiber-Optic Ring Communications Between Missile Subsystems
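
The penalty of the 3 dB loss per drop point can be illustrated with a short calculation. The sketch below is not taken from the report: it simply accumulates 3 dB for every T-coupler a signal passes on a party-line fiber bus and converts the total to a fraction of launch power. It assumes losses add linearly in dB and ignores fiber and connector losses; the function names are hypothetical.

```python
def bus_loss_db(drops_passed: int, loss_per_drop_db: float = 3.0) -> float:
    """Cumulative optical loss after passing a number of T-coupler drop points."""
    return drops_passed * loss_per_drop_db


def remaining_power_fraction(loss_db: float) -> float:
    """Convert a loss in dB to the fraction of optical power remaining."""
    return 10 ** (-loss_db / 10)


for drops in (1, 2, 4, 8):
    loss = bus_loss_db(drops)
    fraction = remaining_power_fraction(loss)
    print(f"{drops} drop(s): {loss:.0f} dB loss, {fraction:.4f} of launch power")
```

Each drop roughly halves the available optical power, so the loss grows quickly with the number of subsystems on a party-line bus. In a ring, each link runs point-to-point between neighbouring subsystems, so, assuming each node retransmits what it receives, the loss does not accumulate in this way.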

3.4  Navy  Demonstration  System 

The culmination of the above work has been the fabrication of a basic federated
microcomputer guidance and control system under an NSWC contract (Reference R-19).
This microcomputer system constitutes the "hardware-in-the-loop" element of a
real-time missile simulation to evaluate the performance of the federated approach under
the constraints of a MIL-STD-1553B I/O protocol (Figure 3-16).

Breadboard versions of the µC macromodules have been designed and built using
standard industry µC components integrated with microbus interface modules (MIMs)
(Figure 3-17). The simulation is based upon a modular digital missile guidance simulation
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Figure 3-16 - Navy/Raytheon Hardware-in-the-Loop Federated
µC System for Missile Performance Simulation
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Figure  3-17  -  Federated  Microcomputer  Macromodules  Using  MIM  Interface 
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system developed for NSWC under a separate contract (Reference R-32). System growth
is achieved by adding additional microcomputers to the system bus and
transferring/recoding simulation program modules to be executed by the appropriate
microcomputer(s).

Figure  3-18  shows  modular  growth  from  the  simple  low-performance  guidance  and 
control  system  of  Figure  3-17  to  a  high-performance  super-federated  system  using  several 
microprocessors of the same type and maintaining the original data memory and I/O
modules. Each microprocessor executes only one algorithm using a dedicated program
memory chip. This arrangement ensures software modularity and programming simplicity
while minimizing microbus traffic.

3.5  Summary 

Federated  microcomputer  systems  provide  the  flexibility  to  design,  develop, 
modify  and  update  missile  guidance  and  control  systems  on  an  individual  subsystem  basis, 
thereby enhancing system modularity. Standard industry microcomputer components
which meet military environmental specifications can be integrated into a set of
microcomputer macromodules using a standard programmable interface module and microbus.
To achieve and maintain modularity in software, the potential exists to assign each major
program module to a separate single-chip microcomputer, placing a fixed hardware
interface between major function algorithms. Furthermore, by exploiting parallelism and/or
the time overlapping of function execution, the use of several standard-industry,
single-chip microcomputers in a "super-federated" configuration eliminates the need to resort to
one-of-a-kind, high-speed, bit-slice processors for high performance missiles, with their
attendant hardware and software logistics support problems. In terms of signal
processing, improved charge-coupled device technology, in the form of chirp-Z transform
processors using transversal filters, offers a solution to the high chip/parts count of current
fast Fourier transform processors. A two-chip CZT processor would match the level of
large-scale circuit integration presently available in microcomputer technology.

In  cases  where  federated  microcomputer  systems  are  distributed  physically 
throughout a missile, relatively low performance, fiber-optic, serial-digital
communications between microcomputer-based subsystems using a round-robin protocol eliminate
the high power, transformer-coupled interface of traditional electrical bus systems.
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Figure 3-18 - Super-Federated Microcomputer System for Higher Performance Missile
Guidance and Control
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4.  CLASSIC  MULTIPROCESSOR  ARCHITECTURES 

Before  embarking  on  the  super-federated  microcomputer  system  architectural 
design,  a  review  of  earlier  multiprocessor  architectures  was  performed  to  determine  their 
respective  merits  and  failings  both  from  a  hardware  and  software  viewpoint. 

Although  large  physically,  due  to  the  state-of-the-art  in  hardware  at  the  time  of 
construction, these earlier architectures have become classical in terms of the various
approaches  adopted  to  solve  such  problems  as  throughput,  availability  and  growth  for  both 
random  and  highly  repetitive  computing  tasks. 

Further,  the  deficiencies  experienced  in  these  architectures  (particularly 
software), are as important today as earlier, except of course for the shortcomings
resulting from the number of discrete components and their associated failure rates.

4.1  Multiprocessor  and  Computer  Systems 


To  overcome  the  deficiencies  of  single,  uniprocessor  computer  systems  for  high- 
performance,  high-availability  applications,  viz:  limited  throughput;  failure  upon  a  single 
fault;  restricted  growth  in  size  and  performance;  various  architectures  incorporating 
either  several  of  the  basic  elements  of  a  computer,  i.e.  CPUs,  memories,  and  IOUs, 
(multiprocessor  systems),  or  several  whole  computers  (multicomputer  systems)  have  been 
designed  and  built.  These  systems  are  characterized  by  their  ability  to  perform  the 
simultaneous  or  parallel  execution  of  similar  and/or  different  tasks  at  several  times  the 
speed  of  a  single  sequential  machine. 

Multiprocessor systems are essentially expanded and more complex versions of the
basic Von Neumann uniprocessor (Reference R-33), usually performing a centralized role
in a given system. However, a significant drawback to certain types of past
multiprocessor systems has been the executive processing load associated with the
efficient utilization of processors in a multi-task operating environment (Reference
R-34). Since this overhead remains sequential, system throughput is not linearly
proportional to the number of processors employed.
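
The effect of a sequential executive load on multiprocessor throughput can be sketched with the familiar Amdahl's-law model. This model is brought in here for illustration and is not named in the report; the 10 percent sequential fraction is an assumed value, not a measured one.

```python
def speedup(n_processors: int, sequential_fraction: float) -> float:
    """Amdahl-style speedup when a fixed fraction of the work stays sequential."""
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / n_processors)


# Assumed executive overhead: 10 percent of the work cannot be parallelized.
EXECUTIVE_FRACTION = 0.10

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} processors: speedup {speedup(n, EXECUTIVE_FRACTION):.2f}")
```

Under this assumption the speedup falls increasingly short of the processor count and can never exceed ten, however many processors are added.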

Multicomputer  systems,  on  the  other  hand,  are  composed  of  several  relatively 
simple  and  familiar  computers  interconnected  via  their  I/O  units  (IOUs).  Multicomputer 
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systems  normally  function  as  decentralized,  distributed/federated  systems  with  each 
computer  dedicated  to  a  specific  set  of  interrelated  tasks  and  external  devices  (EDs). 

Communication  within  a  multiprocessor  system  involves  megaword  per  second 
information  transfer  rates  whereas  within  a  multicomputer  system,  i.e.,  between  IOUs, 
functional  partitioning  is  designed  to  achieve  transfer  rates  in  the  kiloword  per  second 
range.

The  various  characteristic  forms  of  multiprocessor  and  multicomputer  systems 
built  to  date  involve  either  single  or  multiple  data  busses,  or  a  cross-point  switching 
matrix  for  communication  between  major  computer  elements  or  computers  respectively. 
In the following examples reviewed it can be seen that it is both the type of
communication employed and the degree of customization of the architecture to the type of
processing task to be executed which characterizes the system, whether multiprocessor or
multicomputer.

4.1.1 Single Time-Shared/Party-Line Bus

The simplest form of multiprocessor and multicomputer system employs
a single, time-shared/party-line communications bus. These architectures achieve the
highest throughput only when accesses to the bus can be scheduled to avoid user conflicts.
Of the two types of computer system, the multiprocessor (Figure 4-1) is more throughput
limited by the single bus than its multicomputer counterpart due to the lack of autonomy
of the individual processors and their dependence upon access to a common/shared "main"
memory. The multicomputer system (Figure 4-2), however, is far more amenable to the
single bus for inter-IOU communication due to the low transfer rates in a properly
partitioned system. In the example shown, bus accesses can be controlled either by a
master-slave hierarchy or on a round-robin basis to eliminate conflicts. Furthermore, the low
inter-IOU transfer rates in a well designed multicomputer system enable a serial digital
multiplex bus to be employed, affording higher reliability through simple duplication.
Current technology in serial data transmission also offers virtually error-free
performance at 1 MHz bit rates and, using a ring bus with a round-robin I/O protocol, a low cost
fiber-optic data link becomes practicable. Figure 4-3 shows a simple missile guidance and
control system using one microcomputer (µC) per subsystem, a serial digital input-output
(SDIO) channel and fiber-optic/T²L transmitter (T) and receiver (R) interface circuits.
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Figure  4-1  -  Multiprocessor  System  -  Single  Time-Shared  Memory  Access  Bus 
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Figure  4-2  -  Multicomputer  System  -  Single  Time-Shared  Communications  Bus 
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Figure 4-3 - Multicomputer System - Fiber-Optic Coupled Round-Robin
Serial-Digital I/O
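
As a rough check that a 1 MHz serial bus fits the kiloword-per-second transfer rates expected between IOUs, the sketch below estimates the word rate of such a link. It assumes MIL-STD-1553B word framing of 20 bit times per 16-bit word (3 sync, 16 data, 1 parity), which the section does not state for this bus, and it ignores inter-message gaps and protocol overhead, so the result is an upper bound. The equal-share round-robin split and the five-node example are also assumptions.

```python
BIT_RATE_HZ = 1_000_000  # 1 MHz bit rate quoted in the text
BITS_PER_WORD = 20       # assumption: 1553B framing, 3 sync + 16 data + 1 parity


def max_words_per_second(bit_rate_hz: int = BIT_RATE_HZ,
                         bits_per_word: int = BITS_PER_WORD) -> float:
    """Upper bound on 16-bit words per second, ignoring gaps and overhead."""
    return bit_rate_hz / bits_per_word


def round_robin_share(n_nodes: int) -> float:
    """Words per second per node if the bus is shared equally in turn."""
    return max_words_per_second() / n_nodes


print(f"bus capacity: {max_words_per_second():.0f} words/s")
print(f"per node with 5 subsystems: {round_robin_share(5):.0f} words/s")
```

Even with the capacity divided among several subsystems, each node retains a rate in the kiloword-per-second range.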


4.1.2  Multiple  Bus 

This form of communication within a computer system is more commonly
encountered in multiprocessor systems to overcome the speed limitations of the single,
party-line bus, thus trading off simplicity for increased speed, size, weight, cost and
complexity. Figure 4-4 shows a typical multiple bus multiprocessor which has been with us
for well over a decade. Each memory user (processors and IOUs) has a separate bus to
access any memory bank. Conflicts in accessing the same memory are resolved in each
memory bank by a multiplexer (MUX) with priority logic. Optimum speed is achieved
when processors can use separate, dedicated memory banks for their respective
instruction and operand accesses coupled with infrequent accesses to shared data bases and
similarly infrequent DMA I/O transfers. In earlier systems using destructive readout
(DRO) core memories with relatively long data transfer cycles, it was possible to access
the stored data before the completion of the full memory cycle, thereby enabling the
partial overlapping of instruction and operand fetch cycles when the latter were stored in
separate memory units. Such fine tuning techniques are neither possible nor worthwhile
with today's high-speed semiconductor memories. Under these most favorable conditions,
virtually a multicomputer operating mode, throughput for a dual microprocessor system
approaches twice that of a uniprocessor.

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Figure 4-4 - Multiprocessor System - Multiple Memory Access Bus

A  multicomputer  system  employing  independent  I/O  busses  is  shown  in  Figure  4-5. 
Such  a  system  requires  multiplexed,  direct-memory-access  (DMA)  IOUs. 

4.1.3  Cross-Point  Switch 


This form of communication/coupling between computer elements or
computers was first described by H.A. Keit in 1960 and formed the essence of what was
termed the "Polymorphic" concept (References R-35 and R-36), i.e., a system having "many
shapes". One of the major objectives was to decentralize system control by making a
passive switch the central element instead of a single processor. Figure 4-6 shows a
multiprocessor configuration employing the polymorphic principle which was used in a
high availability tactical air defense system (Reference R-37). Figure 4-7 is an example
of a multicomputer version. With reliable solid-state switches these systems
provide high availability, flexibility and growth capabilities, and although of necessity
confined to ground systems in the past, due to the size and weight of available hardware,
a polymorphic multi-microcomputer system with serial communications and LSI
switching becomes a viable candidate for tactical avionic and missile systems (Figure 4-8).

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Figure 4-5 - Multicomputer System - Multiple I/O Communications Bus

Figure 4-6 - High-Availability Multiprocessor System - Cross-Point
Switch Communications
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Figure 4-7 - Multicomputer System - Cross-Point Switch I/O Communications

4.1.4  Array  Processor  Systems 


Array  processors  achieve  high  throughput  for  a  limited  range  of  tasks, 
thereby trading general-purpose features for speed. In one of the first forms of array
processor, the register arithmetic and logic unit (RALU) of the uniprocessor was
effectively replaced by a matrix or array of processing units, each interconnected to its
neighbor,  and  the  whole  designed  to  achieve  high  throughput  for  mesh  oriented  problems 
using  a  common  instruction  stream. 
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Figure 4-8 - Tactical Multimicrocomputer System - Serial Cross-Point Switch
I/O Communications

Speed improvement of array processors over the uniprocessor can be a
factor equal to the number of processing units in the array, provided that they are
continuously active. The availability of such systems for tactical applications is degraded
by system complexity and the single common source of instructions. The latter
deficiency, however, could be overcome with autonomous processing units.

4.1.4.1  Single  Integrated  Array 

The  SOLOMON  II  (Reference  R-38)  provides  a  good  example  of  an 
array  processor  system  using  a  single  integrated  array  (Figure  4-9).  The  processing  units 
in  the  array  are  virtually  small  microcomputers,  each  incorporating  a  128  x  32-bit 
memory. 
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Figure  4-9  -  Array  Processor  System  -  Single  Integrated  Array 

4.1.4.2  Multiple  External  Array 

The  early  ILLIAC  IV  (Reference  R-39)  shown  in  Figure  4-10 
employed four arrays and four control units, thus significantly improving system
availability and flexibility by eliminating the dependence upon a single control unit (CU) and
instruction  stream.  The  master  executive  is  resident  in  an  external  host  computer  which 
impacts  upon  the  overall  system  reliability.  The  ILLIAC  IV  array  processing  units  were 
again  effectively  microcomputers,  each  with  a  high-speed  2K  x  64-bit  memory  and  parallel 
arithmetic  and  logical  unit.  Figure  4-11  illustrates  a  possible  4x4  microcomputer  array. 
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Figure  4-10  -  Array  Processor  System  -  Multiple  External  Array 


Figure 4-11 - Array Microcomputer System - Single External 4 x 4 Array
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4.1.5  Associative  Processor  Systems 

In  common  with  the  array  processor,  the  associative  processor  (Reference 
R-40)  achieves  high  throughput  by  executing  a  specific  class  of  functions  which,  instead 
of  suiting  an  array  of  processing  units,  are  suited  to  the  use  of  a  content  addressable 
memory.  Tasks  in  this  category  typically  involve  the  correlation  of  many  data  points  with 
a  common  reference  point,  e.g.,  target  tracking.  Figure  4-12  illustrates  an  associative 
processor  system,  where  the  single  RALU  in  a  uniprocessor  is  effectively  replaced  by  a 
stack  of  processing  units,  each  incorporating  a  serial  ALU,  "memory",  some  form  of 
autonomous control (CU) and input-output circuits (IOU), or in other words a
microcomputer. The "memory" in each processing unit is normally considered as one long word (128
or  256  bits)  divided  into  fields,  each  containing  a  specific  parameter  pertinent  to  the 
single  item  stored,  e.g.,  range,  azimuth  and  elevation  of  a  target. 

Availability  of  such  a  system  is  again  impaired  by  a  common  instruction 
stream,  which  feeds  the  associative  memory,  and  the  overall  uniprocessor  architecture. 
Speed  is  unsurpassed  for  correlation  type  tasks  in  a  multitarget  environment,  and  is 
unaffected  by  the  number  of  targets  within  the  storage  capacity. 
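
The correlation task described above can be sketched in software as a content-addressed search over stored target words. The example below is only a model of the result: in an associative processor every processing unit compares its own word with the broadcast reference at the same time, whereas the loop here visits the words one by one. The field names follow the range, azimuth and elevation example in the text; the gate sizes and track values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrackWord:
    """One associative-memory word, divided into fields for a single target."""
    range_m: float
    azimuth_deg: float
    elevation_deg: float


def associative_match(memory, ref, range_gate_m, angle_gate_deg):
    """Return the indices of all stored words lying within the gates around ref.

    Hardware would perform every comparison simultaneously; this list
    comprehension models only the outcome, not the timing.
    """
    return [
        i for i, word in enumerate(memory)
        if abs(word.range_m - ref.range_m) <= range_gate_m
        and abs(word.azimuth_deg - ref.azimuth_deg) <= angle_gate_deg
        and abs(word.elevation_deg - ref.elevation_deg) <= angle_gate_deg
    ]


# Hypothetical stored tracks and one new radar return used as the reference.
tracks = [
    TrackWord(10_000.0, 45.0, 5.0),
    TrackWord(10_150.0, 45.5, 5.2),
    TrackWord(22_000.0, 120.0, 12.0),
]
new_return = TrackWord(10_100.0, 45.2, 5.1)

hits = associative_match(tracks, new_return, range_gate_m=200.0, angle_gate_deg=1.0)
print("matching track indices:", hits)
```

Because each word is compared by its own processing unit, the hardware search time does not grow with the number of stored targets, which is the property claimed in the text.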


Figure  4-12  -  Associative  Processor  System 
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4.1.6 "Hybrid" Processor Systems

In an effort to achieve high speed for all or many different classes of
problems within a central computer system, systems have been configured to handle all the
tasks  encountered  in  tactical  air  defense  and  avionic  systems.  In  the  two  configurations 
reviewed,  a  distinction  is  made  between  tasks  amenable  to  highly  parallel  processing,  and 
the  remaining  irregular  tasks  which  are  better  suited  to  the  traditional  sequential  method 
employed  in  GP  uniprocessors. 

4.1.6.1 Dual-Bus External Ensemble


One  configuration  of  a  high-speed  multi-function  processor  is  shown 
in  Figure  4-13.  This  form  of  processing  system  (Reference  R-41)  employs  an  "ensemble" 
rather  than  an  array  of  processing  units  as  an  adjunct  to  a  GP  computer  which  performs 
the  irregular  sequential-oriented  tasks  and  furnishes  instructions  to  the  ensemble  for 
tasks  suited  to  parallel  processing.  A  common  global  control  unit  interfaces  with  the  GP 
computer  and  the  radar  subsystem  and  furnishes  operands  and  microinstructions  to  all 


[Figure 4-13 block labels: Processor Ensemble; Global Control Unit]

Figure 4-13 - "Hybrid" Processor Systems - Dual-Bus
External Ensemble (PEPE)
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processing units via a dual-bus system, one dedicated to correlation tasks, the other
to arithmetic. Each processing unit is again the equivalent of a microcomputer with a 512 x
32-bit word memory. Virtually any number of targets can be processed simply by adding
more processing units and without any apparent loss in throughput. Availability, however,
is again jeopardized by the dependence on a single GP computer and global control unit.

4.1.6.2 Multiple-Bus Integrated Ensemble

An  integrated  approach  to  the  ensemble  of  processors  is  exemplified 
in  the  system  shown  in  Figure  4-14.  In  this  system  (Reference  R-42),  the  host  computer 
is  eliminated  and  the  single  RALU  in  the  uniprocessor  configuration  is  again  effectively 
replaced  by  an  ensemble  of  sequential  uniprocessors  and  one  special-purpose  processor 
which incorporates an FFT processor, associative processor and pseudo-associative
memory. A separate bus is provided for access to the bulk storage, main store and master
executive control (shared bus), and the IOU. These communications busses are further
augmented by an interprocessor bus and direct links between the special-purpose
processor and the bulk store matrix providing EDs with direct access to the sequential
uniprocessors. Each uniprocessor contains a 2-4K word high-speed memory to store and
process large routines. These routines or program modules are transferred from the main
store as "burst" transfers under the direction of the master executive control (MEC), the
objective being to reduce activity and conflicts on the bus system compared to
conventional multiprocessor systems. The ensemble is in many respects the multibus,
multicomputer system of Figure 4-5 without the permanent dedication of computers to
specific  sets  of  tasks,  except  in  the  case  of  the  MEC  which  is  virtually  a  host  computer. 
Availability  of  this  system  would  depend  on  the  duplication  of  the  MEC  and  other  critical 
programs,  for  immunity  against  memory  failures,  together  with  a  duplicate  special- 
purpose  processor.  The  complexity  of  the  parallel  multi-bus  communications  system 
represents  a  significant  deterrent  to  achieving  true  high  availability. 
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Figure 4-14 - "Hybrid" Processor Systems -
Multiple-Bus Integrated Ensemble (Original Navy AADC)


4.1.7  FFT  Processor  Architectures 

The Cooley-Tukey fast Fourier transform (FFT) algorithm (Reference
R-43) has been widely used in high-speed real-time signal processing since its introduction
in 1965. Although the FFT algorithm provided a dramatic reduction in the number of
arithmetic operations required to perform frequency spectrum analysis, it nevertheless
severely burdened the throughput capability of the conventional general-purpose
uniprocessor. As a result of the latter deficiency, various architectures were identified
(Reference R-44) and developed, providing gigahertz (GHz) computational rates through the
use of pipelined arithmetic elements within the arithmetic unit(s) (AU) of the processor.
Figure 4-15 illustrates the growth from a single sequential uniprocessor to a pipeline of
AUs (one for every major iteration in the FFT), a parallel iterative organization, and a
full array. As in the previous architectures reviewed, it is conceivable to replace each
MEM/AU combination with a single-chip microcomputer, thereby deriving similar
factorial throughput improvements, based upon the performance of the µC.
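
The relationship between the four organizations can be sketched numerically. The example below counts arithmetic units and estimates the time per transform for a radix-2 FFT, using the 150 nsec per 2-point operation given in Figure 4-15. The parallel iterative and array figures reproduce the values on the figure; the sequential and pipeline figures are derived here on the assumption that every 2-point operation takes the same time and that the pipeline and array are kept full, and they do not appear in the report.

```python
import math

BUTTERFLY_NS = 150  # time per 2-point operation, from Figure 4-15


def fft_organizations(n_points: int, butterfly_ns: int = BUTTERFLY_NS) -> dict:
    """Map each organization to (arithmetic units, nsec per transform)."""
    stages = int(math.log2(n_points))  # major iterations in the FFT
    per_stage = n_points // 2          # 2-point operations per iteration
    return {
        "sequential": (1, stages * per_stage * butterfly_ns),
        "pipeline": (stages, per_stage * butterfly_ns),
        "parallel iterative": (per_stage, stages * butterfly_ns),
        "array": (stages * per_stage, butterfly_ns),
    }


for name, (units, time_ns) in fft_organizations(64).items():
    print(f"{name:18s} {units:4d} AU(s)  {time_ns:6d} nsec")
```

For a 64-point transform the product of arithmetic units and time per transform is the same in every row, which shows that each organization trades hardware for speed in direct proportion.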
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[Figure 4-15 labels: Array, (N/2) log2 N AUs, 150 nsec; Parallel Iterative, N/2 AUs,
900 nsec; basis: 64-point complex FFT, 150 nsec per 2-point operation]

Figure 4-15 - FFT Processor Architectures

4.1.8  Conclusions 


It becomes apparent that the classic multiprocessor/computer architectures
reviewed are just as easily implemented (if not more easily in a practical sense) with
today's single-chip microprocessors/computers as with the smaller scale integrated
circuits. Of the several types discussed, the simplicity of the single time-shared bus
multiprocessor  configuration  suggests  its  viability  as  a  candidate  for  the  kernel  element 
of  a  modular  high-performance  architecture.  The  chief  deficiency  of  the  single  bus 
architecture  lies  in  the  shared  memory,  both  programs  and  data,  for  each  microprocessor 
and  the  resulting  high  incidence  of  conflicting  memory  accesses.  Further,  the  modular 
software  goal  is  defeated  through  shared  memory.  The  latter  deficiencies  were  therefore 
more  carefully  scrutinized  in  an  effort  to  adapt  the  basic  architecture  to  satisfy  both  high- 
throughput  and  modular  software. 
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5.  SUPER-FEDERATED  MICROCOMPUTER  SYSTEM  (SFMCS) 

As  was  stated  in  the  preceding  section,  the  single-bus  multiprocessor  architecture 
has  only  the  single  merit  of  simplicity  in  hardware  design,  but  if  solutions  could  be  found 
to  its  throughput  and  software  modularity  deficiencies  it  could  well  prove  to  be  an  ideal 
kernel processing element for modular digital missile systems.

5.1  Modifying/Optimizing  the  Single-Bus  Multiprocessor  Architecture 


Since  one  of  the  original  design  goals  was  to  perform  a  software  change  through  an 
integrated-circuit (IC) hardware change, i.e., identifying each major functional algorithm
with  an  IC,  the  first  obvious  modification  to  the  conventional  single  bus  multiprocessor 
architecture  was  separation  of  program  memory  from  data.  Figure  5-1  illustrates  the 
change. 


Figure  5-1  -  Single  Time-Shared  Bus  Multi-Microprocessor  - 
Separate  Program  Memories 

Separating  programs  from  data  has  two  major  advantages: 

1)  Reduced  microbus  access  conflicts  and  hence  higher  throughput. 

2)  Each  major  program  module/algorithm  is  identified  with  a  PROM  IC. 
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The remaining shared data memory becomes a "mail box" for the transfer of
computed  results  from  one  major  algorithm  to  another,  and  stores  the  partial  results 
of  each  of  the  separate  programs  when  the  number  of  operands  exceeds  the  register 
storage  capacity  of  each  microprocessor. 


5.1.1  Memory  Mapping  and  Single  Computer  Programmability 

The  dedication  of  program  memories  to  specific  program  modules,  which  in 
turn  are  assigned  to  individual  microprocessors,  enables  each  processor  to  be  programmed 
as  an  entity  rather  than  as  part  of  a  complex  multiprocessor  system.  The  only  proviso  is 
the  establishment  of  a  common  memory  address  boundary  for  the  beginning  of  the  shared 
data  base.  This  base  address  must  exceed  the  highest  program  address  used  by  any  one  of 
the  group  of  microprocessors  in  the  system,  and  programming  can  then  proceed  as  for  a 
single  processor.  Figure  5-2  illustrates  this  memory  mapping  approach  for  four 
processors. 
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Figure  5-2  -  Super-Federated  Multiprocessor-Memory  Mapping  Class  in  Seeker  Processing 
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Program  growth  can  be  accommodated  for  each  function  up  to  the 
predefined  data  boundary.  Data  can  be  similarly  submapped  for  memory-mapped  I/O 
channels  as  well  as  partial  and  final  results.  Further,  if  several  similar  super-federated 
groups  are  controlled  by  a  host  processor,  a  third  upper  level  of  memory  space  can  be 
made available to each microprocessor for direct communication with the host processor's
storage  space  (Figure  5-3). 
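
The memory-mapping rule described above can be sketched as follows. Each processor's private program occupies addresses from zero upward, the shared data base begins at a common boundary above the largest program, and an optional upper region is reserved for the host processor. All sizes, the 1K alignment, the host base address and the program names are hypothetical values chosen for illustration; none of them come from the report.

```python
def shared_data_base(program_sizes: dict, alignment: int = 0x400) -> int:
    """Lowest aligned address that exceeds every program's highest address.

    A program of size S occupies addresses 0 .. S-1, so any base >= S
    exceeds its highest address. Rounding up leaves room for growth.
    """
    largest = max(program_sizes.values())
    return -(-largest // alignment) * alignment  # ceiling to the alignment


def region_of(address: int, data_base: int, host_base: int) -> str:
    """Classify an address into one of the three levels of the memory map."""
    if address < data_base:
        return "private program"
    if address < host_base:
        return "shared data"
    return "host"


# Hypothetical program sizes in words for a four-processor group.
sizes = {"seeker": 0x0A00, "estimator": 0x0B20, "guidance": 0x0700, "autopilot": 0x0900}
DATA_BASE = shared_data_base(sizes)
HOST_BASE = 0x8000  # assumed start of the host processor's storage window

print(f"shared data base: {DATA_BASE:#06x}")
for addr in (0x0B1F, DATA_BASE, HOST_BASE):
    print(f"{addr:#06x} -> {region_of(addr, DATA_BASE, HOST_BASE)}")
```

With the boundary fixed, each program can grow up to the data base without affecting the others, and each processor can be programmed as if it were a single computer.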


Figure 5-3 - Super-Federated Multiprocessor - Extended
Memory Mapping for Host Processor

5.1.2  High  Throughput  Refinements 


Returning to the basic single bus super-federated multiprocessor of Figure 5-1,
the frequency of data memory access conflicts becomes a function of the degree of
synchronism  of  the  memory  fetch  cycles  of  each  microprocessor.  If  all  four  processors 
were  executing  identical  programs  and  were  driven  by  the  same  clock  waveform,  then  all 
memory  fetch  cycles  would  coincide.  This  situation  could  occur  if,  for  example,  those 
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processors  executed  the  three  autopilot  channels,  i.e.,  pitch,  roll  and  yaw,  respectively. 
At  the  other  extreme,  the  dissimilarity  of  individual  processor  programs  would  minimize 
fetch  cycle  conflicts,  as  would  the  time  skewing  of  clock  waveforms  by  one  memory 
access  interval  to  each  microprocessor.  The  latter  phasing  of  clock  pulses  would  then 
convert the simplistic single bus multiprocessor to a time-phased ring bus architecture
(Reference R-45). Such clock time phasing would eliminate memory access conflicts when
identical  programs  are  being  executed  by  each  microprocessor  and  could  be  expected  to 
significantly  reduce  conflicts  among  dissimilar  instruction  sequences.  The  latter  could 
then  be  resolved  by  a  rotating  priority  scheme  executed  by  a  bus  controller.  Figure  5-4 
illustrates  the  above  timing  for  four  processors,  although  it  would  be  valid  for  more  using 
additional  clock  phases. 
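
The effect of clock skewing on identical instruction streams can be checked with a small counting model. This is a sketch under stated assumptions, not the report's simulation: every instruction is assumed to last as many memory access intervals as there are processors, with the bus needed only in the first (fetch) interval, and `fetch_conflicts` is a hypothetical name.

```python
from collections import Counter


def fetch_conflicts(n_procs: int, skew: int, cycles: int = 8) -> int:
    """Count bus slots claimed by more than one processor.

    Assumed model: each instruction takes n_procs memory access intervals,
    the first of which is the fetch that needs the shared bus. Processor p
    starts skew * p intervals after processor 0.
    """
    slots = Counter()
    for p in range(n_procs):
        for k in range(cycles):
            slots[p * skew + k * n_procs] += 1
    return sum(1 for users in slots.values() if users > 1)
```

With no skew every fetch slot is contested by all four processors; with a skew of one interval per processor each fetch falls in its own slot.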


Figure 5-4 - Time-Phased Ring Bus Memory/Instruction Fetch (F) and Execute (E)
Sequences, Four Microprocessors, Identical Instruction Streams

5.1.3 Time-Phased Ring, Practical Case

To be effective in practice, the foregoing timing refinements must
be applicable to commercially available microprocessors. The Intel 8086 and Zilog Z-8000
16-bit microprocessors were selected as representative of the state-of-the-art in
single-chip microprocessor technology.

5.1.3.1 Intel 8086 Waveforms


Figure 5-5 shows the compatibility of the Intel 8086 timing waveforms
with time-phased clocking of each microprocessor clock input. Further details of
the latter timing relationships are given in Appendix A.

5.1.3.2 Zilog Z-8000 Waveforms

The Zilog Z-8000 microprocessor requires a minimum of three clock
cycles to output the memory address and transfer a data word to/from memory. Figure
5-6 shows the staggering of ring microbus accesses through the use of a four-phase clock
(Appendix A).

5.2  Expanded  System  Architecture 


Using the basic four-microprocessor module (quad) described in the previous
paragraphs, an expanded system was configured using four quads and a host processor
(Figure 5-7). A functional block diagram of this system is given in Appendix C.

The  memory  map  for  this  system  is  as  shown  in  Figure  5-3.  Single  computer 
programming  simplicity  is  maintained  and  memory  access  conflicts  are  resolved  by  FPLA- 
based  arbitration  units,  transparent  to  the  programmer.  Input-output  activity  is  handled 
by  memory-mapped  I/O  modules  (ADAC  and  SDIO)  whose  data  is  directly  accessible  by 
any  microprocessor. 



Figure  5-6  -  Time-Phased  Ring  Bus  -  Zilog  Z-8000  Timing  Compatibility 
Figure 5-7 - Expanded Super-Federated Microcomputer System For
Physically Centralized Applications


This architecture, using 16-bit parallel data paths between quads and host, provides
high throughput, i.e., up to 10 MIPS, depending on the instruction mix, microprocessor type
and the amenability of the task to distribution among the processors. In the configuration
shown for missile guidance and control, each quad contains separable subfunctions of the
major function. The host random access memory (RAM) provides a common mailbox
store for all four quads, e.g., for the transfer of "g" commands from the seeker quad to
the autopilot quad at the low 10-20 Hz rate. Such a system provides super-federation of
hardware and software in applications where all the processing hardware must be located
in one place.

5.3  Physically  Distributed  Systems 
Similarly, growth from a low-performance federated guidance and control system
such  as  that  built  for  NSWC  (Figure  5-9)  to  a  high-performance  Class  III  system  could  be 
achieved  by  the  simple  substitution  of  a  SFMCS  quad  in  place  of  the  single  microprocessor 
(Figure  5-10). 

The chief difference between the expanded system of Figure 5-7 and that of Figure
5-10 is the connection of I/O modules to the quad microbus, as opposed to the host
processor microbus, and the memory mapping of I/O RAMs as part of the ring RAM data base.


Figure  5-9  -  Low-Performance  Federated  Microcomputer 
Missile  Guidance  and  Control  System  (NSWC  System) 
Figure 5-10 - High-Performance Super-Federated Microcomputer System For Higher
Performance Missile Guidance and Control (NSWC System)

6. SFMCS SOFTWARE

As stated earlier in this report, the primary advantages of super-federated
computer systems for on-board missile guidance and control are throughput and physical
compatibility with the modular design requirements of a missile, both hardware and software.
From a software architecture point of view, there are certain technical trade-offs which
must be analyzed to achieve the primary design requirements, viz.:

1)  High  throughput 

2)  System  extensibility 

3)  Minimum  software  development  risk 

4)  Associative  software/hardware  modularity 

The throughput capabilities of a super-federated system will be affected by the
following factors:

•  Distribution  of  application  programs  throughout  system  memory 

•  Distribution  of  application  programs  among  the  processors  within  the 
federated  system 

•  Distribution  of  data  throughout  system  memory 

•  Selection  of  control  software  (network  executive) 

Support  of  system  extensibility  is  directly  related  to  the  amount  of  software 
modularity  which  can  be  supported  by  the  super-federated  system  architecture  and  the 
adaptability  of  the  control  software  to  the  changing  software  requirements. 

Reduction  of  software  development  risk  in  a  super-federated  system  is  related  to: 

•  Use  of  high  order  languages  which  support  concurrent  processing 

•  Independence  of  application  program  design  from  system  architecture 

•  Strict  adherence  to  software  modularity  guidelines. 
The close interrelationship of software and hardware modularity is viewed as a key
ingredient in the efficient management of software development and maintenance
throughout the system's life cycle.

Hence,  it  is  apparent  that  the  successful  use  of  a  super-federated  system  is 
influenced  by  the  distribution  of  programs  and  data  throughout  the  system,  modularity, 
selection  of  control  software,  and  the  use  of  high  order  languages  to  support  concurrent 
processing. 

6.1  Distribution  of  Programs  and  Data 


The  distribution  of  programs  among  the  various  processors  and  system  memory  will 
directly impact system throughput. If common memory were used to hold all programs,
the throughput of the SFMCS, or of any tightly coupled distributed system,
would be limited by the availability of that memory to the various
processors. In general, the missile environment does not lend itself to the use of "large
amounts  of  code  sharing"  by  various  software  functions.  For  example,  the  code  of  an 
autopilot  is  distinct  from  the  code  of  a  tracking  filter,  with  the  possible  exception  of  a 
service  routine  (possibly  a  matrix  manipulation  service).  For  this  reason  the  optimum 
layout  of  code  throughout  system  memory  is  through  the  use  of  a  local  processor  memory 
in  which  the  processor  does  not  compete  for  use  of  the  memory.  The  SFMCS's 
architecture  permits  maximum  throughput  by  its  use  of  local  memories  (i.e.  dedicated 
PROMs).  The  disadvantage  of  the  local  memory  design  is  a  slight  increase  in  total 
memory  due  to  the  possible  duplication  of  service  routines. 

We  traditionally  accept  target  tracking  logic  as  separate  from  a  guidance  law  or  an 
autopilot  with  limited  data  interfaces.  For  this  reason,  we  partition  the  functions  of 
missile  control  software  among  the  various  processors  within  the  system.  The  SFMCS 
offers  the  ability  to  partition  the  functions  within  a  quad  or  within  several  quads. 

But  the  distribution  of  a  function  (set  of  application  programs)  within  a  set  of 
processors  is  complicated  by  our  limited  ability  to  visualize  certain  software  functions  as 
parallel  (i.e.,  concurrent)  processes.  Traditionally,  a  guidance  law  or  a  tracking  filter  is 
presented as a serial process as illustrated in Figure 6-1. The serial process is typified by
the use of a parameter calculated in the previous statement. The challenge is to partition
a  function  so  as  to  maximize  throughput.  Figure  6-2  illustrates  the  type  of  partitioning 
which  must  be  used  to  distribute  the  example  in  Figure  6-1. 
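
One way to look for concurrency in a serial listing is to group statements by data dependency. The sketch below does not reproduce Figures 6-1 and 6-2: the statement names in `STATEMENTS` are hypothetical, and `stages` is an illustrative helper rather than a tool described in the report.

```python
# Hypothetical statements: each maps an output parameter to the inputs it reads.
STATEMENTS = {
    "a": [],          # a is computed from raw measurements only
    "b": ["a"],       # b uses a: serial dependence on the previous statement
    "c": [],          # c is independent of a and b
    "d": ["b", "c"],  # d must wait for both branches
}


def stages(statements):
    """Group statements into stages; members of one stage can run concurrently."""
    level = {}

    def depth(name):
        # A statement's stage is one more than the deepest stage it depends on.
        if name not in level:
            level[name] = 1 + max((depth(d) for d in statements[name]), default=-1)
        return level[name]

    grouped = {}
    for name in statements:
        grouped.setdefault(depth(name), []).append(name)
    return [sorted(grouped[k]) for k in sorted(grouped)]
```

Here `stages(STATEMENTS)` places `a` and `c` in the same stage, so they could be assigned to different processors, while `b` and `d` must follow in order.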
Figure 6-1 - Typical Serial Process

The development of software tools to assist in this partitioning activity is a necessity
if the application of the SFMCS or any tightly coupled computer system is to mature
to wide application in a real-time environment as throughput-demanding and complex as
missile guidance.

The  partitioning  of  data  will  impact  system  throughput  if  the  data  is  localized  in 
such  a  manner  as  to  cause  the  processors  in  the  system  to  wait  for  access  to  a  particular 
memory  unit.  In  addition,  throughput  is  adversely  affected  if  a  processor  must  wait  for 
data  to  be  available  before  it  can  continue  processing  (as  illustrated  in  Figure  6-2).  (Data 
consistency  is  obviously  an  important  consideration  in  any  partitioning  scheme.) 

The  SFMCS  permits  the  distribution  of  data  among  the  various  levels  of  memory  to 
minimize  memory  conflict.  If  we  take  the  example  in  Figure  6-2,  we  would  distribute  the 
parameters  as  illustrated  in  Figure  6-3. 

6.2  Modular  Software 

This section summarizes the intrinsic characteristics of modular software as they
tend to impact on a super-federated microcomputer system architecture (References
R-46 and R-47). Desirable modularity features are as follows:


Figure  6-3  -  Distribution  of  Data 
• May be executed as an independent set of code given proper drive
input.

• The software module is not required to obtain its input from the
system (i.e., do I/O operations); rather, the system supplies the module
with data.

• The code may be transported from system to system with no changes
in either the code or the method of linkage.

•  The  code  should  be  machine  and  system  independent. 

•  The  execution  of  the  code  should  be  system  thread  independent. 

• The module should be able to be replaced with a stub whereby a
command response is a given rather than a calculated one.

•  It  should  be  identified  as  a  set  of  logic  with  real  world  boundaries, 
i.e.,  a  PROM  integrated  circuit. 

• If the module is an I/O driver, its method of linkage to the system
should be independent of the specific I/O device.

•  A  modular  system  is  one  in  which  units  of  standard  size,  design,  etc., 
can  be  arranged  or  fitted  together  in  a  variety  of  ways 
(implementation  independence). 

•  The  operating  system  needs  to  be  distributed  and  not  centralized. 
Centralization  and  modularity  are  diametrically  opposite  concepts. 

•  A  centralized  executive  is  one  which  is  highly  dependent  on  the 
implementation. 

•  A  hierarchy  of  distributed  control  enhances  the  modularity  concept. 

•  The  concept  of  local  autonomy  should  be  used  in  partitioning  the 
distributed  structure. 

• A centralized operating system is based on the concept of a sole
source issuing directives to subordinate tasks, posting requests for
and dispatching tasks for execution. The program threading is
generally controlled by the supervisor. In distributed control, tasks run
asynchronously and do not need to be explicitly dispatched. The local
autonomous control program has sufficient delegated control to
determine whether a task should be executed or other subordinate
tasks dispatched.

For example, consider a conventional OS dispatcher function. Generally an
external stimulus (interrupt) or a task completion causes the OS to schedule a task to run;
the dispatcher is next called to the control point to dispatch the next task according to
priority, resource availability, etc. Generally an explicit directive from the OS initiates
the task.

In distributed control, task managers have access to the task queue and
determine whether or not the task can be run. There will be some duplication of software in
this structure; however, this is the penalty for modularity. Tasks that are truly
autonomous and thread independent can run continuously. An example would be a
continuous A/D converter driving a memory-mapped I/O system. The system need not
command the conversion since it is being performed continuously. If we were to elevate
the level of control to an autopilot, for example, then a continuous autopilot calculation
would be performed. One key to modular software is therefore the linkage mechanism.

6.2.1  Table  Driven  Software  Modules 

Communication between the operating system or real-time executive
and the modular software is performed through messages prepared by the OS and
deposited in a memory space common to both the OS and the modular software. This message
may contain such parameters as the location of the data to be operated on, the task to be
performed, explicit data fields identifying where to deposit results, a field to ascertain
equipment or program status and a variety of parameters necessary to execute the task at
hand. The linkage may not only contain data necessary for the execution of a single
point, sequential task, but may also contain a set of instructions the OS is requesting the
servicing device to perform. The preparation of the table is not restricted to the
operating system; it may also be loaded by other processors passing data or control to the next
processor in the system thread.
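
A message of this kind can be sketched as a small record plus a module that works only from what the record supplies. The field names, the `Directive` class and `run_module` are illustrative assumptions; the report lists the kinds of parameters but gives no layout.

```python
from dataclasses import dataclass, field


@dataclass
class Directive:
    """One table entry deposited in memory common to the OS and a module.

    Field names are hypothetical; only the kinds of fields come from the text.
    """
    task: str                  # the task to be performed
    data_location: int         # where the input data resides
    result_location: int       # where to deposit results
    status: str = "PENDING"    # equipment or program status field
    params: dict = field(default_factory=dict)  # other task parameters


def run_module(directive, memory, functions):
    """Execute the requested task using only what the table supplies."""
    func = functions[directive.task]
    memory[directive.result_location] = func(memory[directive.data_location],
                                             **directive.params)
    directive.status = "DONE"
```

The module never performs its own I/O: it reads from the location named in the table and deposits its result where the table directs, which is what allows the same code to be moved between systems.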

This  method  of  linking  programs  has  several  "buzz  words"  associated  with  it 
which  include:  packet-directed  procedures,  linked  control  blocks,  table  driven  software, 
semaphore  control,  task  block  and  control  program  generation. 

Several  methods  to  bring  a  software  module  up  to  the  control  point  and 
cause  it  to  go  into  execution  are  available.  These  methods  include: 

• A direct call from the OS with argument pointers to the directive
table. This may be termed synchronous execution.

•  A  subprogram  may  be  polling  a  directive  table  when  the  OS  or 
other  programs  deposit  an  execution  directive.  Execution 
commences.  This  mode  of  operation  may  be  termed 
asynchronous. 

•  Direct  hardware  interrupt  is  applicable  primarily  in  multiple 
CPU  configurations,  where  a  single  vector  interrupt  is  used  to 
"wake  up"  an  idling  program  and  cause  execution  to 
commence. 
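
The first two methods can be contrasted in a few lines. This is a sketch only: the table is modelled as a Python dictionary, and the names `scale_module`, `call_synchronously` and `poll_once` are hypothetical. The interrupt method is not shown because it depends on hardware.

```python
def scale_module(table):
    """Module body: reads its inputs from the table, never does its own I/O."""
    table["result"] = table["gain"] * table["data"]
    table["directive"] = None   # mark the directive as consumed


def call_synchronously(table):
    """Direct call from the OS with a pointer to the directive table."""
    scale_module(table)


def poll_once(table):
    """One pass of an idling subprogram that polls the directive table."""
    if table.get("directive") == "RUN":
        scale_module(table)
        return True
    return False
```

In the polled case the OS only deposits `"RUN"` in the table; execution begins whenever the subprogram next examines it, which is why the mode is called asynchronous.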


6.2.2  Composition  of  Software  Modules 

The module is composed of two primary nodes: a computational node and a
management node. The subdivisions of this partitioning include the following elements:

6.2.2.1 Computational Node

1)  Functional  Description  -  A  mathematical  or  algorithmic 
description  of  the  processing  requirement. 

2) Data Integrity - It is the responsibility of the calling procedure
to ensure the data validity before invoking this procedure.

3) Data Source/Destination - The location of source data,
placement of transitory variables and destination of the
resultant data shall be specified in the procedure as a part of the
calling linkages.

6.2.2.2 Management Node

The following management (local) functions are identified:

1)  Identification  of  any  subprocedures/subroutines 
invoked. 

2)  Identification  of  internal  data  for  control  purposes. 

3) Identification of any resources shared by the system.

4)  Providing  the  linkage  to  the  operating  system  to 
allocate  these  resources  to  this  subprocedure. 

Modular software, or configuration independence, requires:

• Functional interface definition

• Flexibility for growth

• Changes in configuration that do not necessarily mean changes in code
(repercussion effects)

• Ability to introduce another subsystem without disturbing the entire
system

• Specification and top level design kept independent of
implementation when possible.

Configuration  dependence  requires: 

•  Interdependence  of  elements 

6.2.2.3 Definition of a Control Block (High Level) Task Switching

For  each  process,  the  software  system  defines  a  control  block  which 
represents  that  task  to  the  system  and  through  which  system  and  process  interaction  is 
performed.  It  represents  a  place  for  the  representation  of  any  relationship  between  the 
process  and  processes  that  have  invoked  it.  It  provides  a  place  for  the  description  of 
events  that  must  be  completed  before  the  task  is  to  operate.  It  provides  a  place  for 
pointers  to  other  system  control  blocks  which  represent  both  the  allocation  of  memory 
and  devices  to  the  process. 
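
The roles listed for the control block map naturally onto a record with one field per role. The sketch below is an assumption about layout, not the report's definition; `ControlBlock` and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ControlBlock:
    """Represents one process to the system (field names are illustrative)."""
    name: str
    invoked_by: Optional[str] = None                    # relationship to the invoking process
    pending_events: set = field(default_factory=set)    # events that must complete first
    memory_blocks: list = field(default_factory=list)   # pointers to memory allocation blocks
    device_blocks: list = field(default_factory=list)   # pointers to device allocation blocks

    def post(self, event: str) -> None:
        """Record that an awaited event has completed."""
        self.pending_events.discard(event)

    def ready(self) -> bool:
        """The task may operate once no awaited events remain."""
        return not self.pending_events
```

For example, an autopilot task invoked by the guidance task and waiting on a new "g" command would report `ready()` as false until that event is posted.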

6.2.2.4 Definition of Reentrancy

A reentrant island of code is one in which no changes occur as the
result of execution at any time. All parameters passed to it, all intermediate
values that it develops, all results, etc., are considered to be objects external to the
code itself. (NOTE: Common system subprograms and subroutines should be reentrant.)
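
The distinction can be illustrated with two versions of the same service routine. Both routines and the `_scratch` variable are hypothetical examples, chosen because a matrix or vector service is the kind of shared routine the report has in mind.

```python
_scratch = [0.0]  # storage owned by the code itself: makes the first routine non-reentrant


def dot_nonreentrant(a, b):
    """Accumulates in shared storage, so interleaved callers can corrupt it."""
    _scratch[0] = 0.0
    for x, y in zip(a, b):
        _scratch[0] += x * y
    return _scratch[0]


def dot_reentrant(a, b, workspace):
    """All inputs, intermediates and results live in caller-supplied objects."""
    workspace["acc"] = 0.0
    for x, y in zip(a, b):
        workspace["acc"] += x * y
    return workspace["acc"]
```

Two processors can execute `dot_reentrant` from the same copy of the code at the same time, provided each passes its own workspace.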

6.3 SFMCS Control Software

While  the  individual  microprocessors  of  the  SFMCS  are  standard  industry  devices 
supplied  with  conventional  support  software,  and  each  processor  can  be  programmed  as  an 
entity  using  a  predefined  memory  map,  nevertheless  the  expanded  architecture  of  Figure  5- 
using  a  host  processor  presents  various  options  for  system  control. 

Possible methods of software control of the Super-Federated Microprocessor
System (SFMCS) are similar to those available to multiprocessors and tightly coupled
distributed systems. The methods available are:

1) Master/Slave

2)  Floating  Executive  (Decentralized)  Polling 

3)  Floating  Executive  with  Multiprogramming. 

6.3.1 Master/Slave

The master/slave method has a master processor in the federated system which
controls the processing carried out by the other (slave) processors in the system.
Essentially, the scheduling and dispatching of tasks within the SFMCS is carried out by the
master processor. The master/slave is a hierarchical configuration with two levels of
hierarchy in which the host (master) performs all of the task scheduling and dispatching
for the satellite (slave) processors.

It should be noted that in general the master/slave relationship is
independent of the method of communications among the processors. The communications
between processors in the SFMCS can be implemented through memory (the slave polls a
memory location for control information from the master) or through a positive control
signal (e.g., an interrupt) between processors.

The  cascading  memory  feature  (the  ability  of  processors  within  the  SFMCS 
to  directly  access  various  levels  of  memory)  of  the  SFMCS  permits  easy  implementation 
of  a  memory  polling  scheme.  The  control  software  would  be  located  in  the  system 
memory  (Figure  6-4). 

Figure 6-4 - SFMCS, Host and Dual-Quad Configuration

The  polling  scheme  has  two  main  drawbacks: 

•  Changing  of  a  slave  processor  task  cannot  be  accomplished 
after  the  slave  processor  has  started. 

•  Use  of  system  memory  becomes  extremely  high  when  slave 
processors  are  polling. 

Figure  6-5  is  a  simplified  illustration  of  the  polling  control  software  required  in  a  slave 
processor. 

The use of a control signal between processors would eliminate the
drawbacks associated with the polling scheme. However, it would increase the
complexity of both the software and the hardware in the
SFMCS. Figure 6-6 is a simplified illustration of a slave processor controlled by a control
signal.

The selection of a master/slave control system for the SFMCS implies that
the application requires the assignment of multiple tasks to the individual processors and
further implies that the task assignments must be synchronized.

Figure 6-5 - Master/Slave (Polling)

Figure 6-6 - Master/Slave (Control Signal)

6.3.2 Floating Executive (Polling)

This type of control software is essentially the same as the polling
master/slave without one processor providing synchronization of the tasks.

The  polling  floating  executive  is  a  simple  routine  if  a  processor  is  required 
to  perform  one  task.  If  processors  are  required  to  perform  more  than  one  task,  the 
polling  floating  executive  must  have  the  ability  to  stack  tasks  for  processors.  The  ability 
to  stack  tasks  would  increase  the  control  software  complexity  significantly. 

The  polling  floating  executive  suffers  from  the  same  drawbacks  as  the 
polling  master/slave  software. 

Figure  6-7  illustrates  a  simplified  version  of  the  polling  floating  executive 
(without  stacking). 
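
The idea of a floating executive, in which no single processor hands out work, can be sketched as processors taking turns to poll a shared task list. The model below is an assumption made for illustration (round-robin polling order, tasks as name and function pairs, the name `floating_executive`); it is not a reproduction of Figure 6-7.

```python
from collections import deque


def floating_executive(task_queue, processor_ids):
    """Each processor in turn polls the shared queue and claims the next task.

    Assumed model: processors poll in round-robin order and a task is a
    (name, function) pair; returns which processor ran which task.
    """
    log = []
    while task_queue:
        for pid in processor_ids:
            if not task_queue:
                break
            name, func = task_queue.popleft()  # claim the task; no master assigns it
            func()
            log.append((pid, name))
    return log
```

Because every processor examines the same queue in system memory, this sketch also shows where the polling traffic comes from.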

6.3.3 Floating Executive (Multiprogrammed)

If  the  application  of  the  SFMCS  requires  that  the  various  processors  in  the 
system  respond  to  multiple  external  stimuli  on  a  priority  basis,  the  multiprogrammed 
floating  executive  will  provide  the  quickest  response.  The  major  drawbacks  of  the 
multiprogrammed  floating  executive  are  complexity  and  size  of  the  control  software. 
Figure  6-8  illustrates  a  multiprogrammed  floating  executive. 

It  should  be  noted  that  the  multiprogrammed  floating  executive  requires  a 
control  signal  between  the  processors  in  the  system  (more  complex  hardware). 

In order to select the control software for an application in which an SFMCS
is to be used, the various methods of software control can be evaluated based on the
following criteria:


•  Response  Time 

•  Throughput 

•  Complexity 

•  Extensibility* 

•  Size  (Memory  Requirements) 

•  Development  Cost 

•  Partitioning  Visibility  ** 

Table  6-1  presents  a  comparison  of  the  various  control  software  methods  as 
applied  to  the  SFMCS. 


*Extensibility is the ability to modify the functions of the system without requiring
changes to the system design.

**Partitioning visibility is the amount of knowledge that the applications programmer
must have of where/how the various functions are partitioned in the system.

Table 6-1 - COMPARISON OF SFMCS SOFTWARE CONTROL METHODS

6.4 High Order Language/Advanced Software Tools

It  is  evident  that  the  distribution  of  software  (both  programs  and  data)  within  a 
super-federated  system  requires  careful  analysis  of  the  software  functions  to  develop  a 
partitioning  scheme  which  will  produce  the  required  throughput  and  satisfy  the 
associative  software/hardware  modularity  goal.  A  manual  partitioning  scheme  would,  in 
most  cases,  result  in  a  software  architecture  which  would  not  follow  modularity 
guidelines. 

Two items appear to be desirable to support software development for a
super-federated or tightly coupled system. The first item would be a concurrent high order
language which would be efficient enough to support real-time processing and relieve
application programmers of the requirement of understanding the details of the hardware
configuration. The current Ada development could conceivably lead to solutions to these
requirements. The second item required to support software development is a
computer-aided partitioning system. Such a system would help the system programmers define
the hardware architecture, e.g., number and configuration of quads, and evaluate various
software partitioning schemes.

7. SFMCS SIMULATION MODELING (Expanded System)

The validation of the expanded SFMCS architecture (Figures 5-7, C2 and C3)
by simple simulation techniques was explored and, in the course of establishing a model,
the significance of each element in the system, in terms of its effect on throughput,
was determined.

The  major  activity  in  the  expanded  system  occurs  in  each  quad  since  the  host 
simply  provides  the  means  of  initializing  the  system  and  interfacing  it  with  the  analog  and 
serial  digital  data  sources/users.  The  latter  occurs  virtually  autonomously  through 
memory-mapped  I/O  channel  buffers.  Further,  traffic  between  host  and  quads  for  missile 
applications is relatively light both in quantity and frequency. The detailed timing
analysis for the quad time-phased ring is given in Appendix A, and this, in many ways, preempts
the need for, and the effectiveness of, a higher level simulation. However, the structure of the
expanded system using several quads and a host computer was characterized before the
detailed timing analysis of the quad was performed.

Major  elements  of  the  system  model  are  shown  in  Figure  7-1.  These  are  developed 
as  follows: 

1)  Microprocessor  (CPU)  Model 

2)  Memory  Address  Translator  Model 

3)  Priority  Resolution  Model 

4)  Microbus  Model 

5)  Memory  Model 

7.1  Microprocessor/CPU  Model 

Figure  7-2  shows  the  simulation  model  developed  for  the  microprocessor/CPU. 

Instructions  are  classified  according  to  type,  memory  access,  local,  wait  states  and 
extended  addressing. 

Figure 7-1 - SFMCS Major Elements and Timing

Figure 7-2 - Microprocessor (CPU) Model Flow Diagram (SA1)

Figure 7-2 - (Cont.)

Move Data

• Memory operation Y, N?

• Data location Mem A, B or C, Port no.=?

• Memory fetches/stores/instruction

• Total clock cycles/instruction

• Probability of an indexed instruction

• Number of repeats

• Next source of instruction field

A  typical  example  stated  in  high  order  language  (HOL)  would  be: 

MD,  Mem  A,  P3,  4,  17,  .3,  2,  L 

This means: move data to/from memory. Specifically:

From Memory A, Port 3

The number of memory accesses = 4

The dwell time for this instruction is 17

The probability the instruction is indexed is 0.3

Repeat this instruction twice

Fetch the next instruction from Local Memory (L)

What  happens  is  as  follows: 

1) The CPU model decodes an MD operation

2) The timing for the memory access is determined
17 clocks/4 machine cycles/state = 4.X cycle

Since there are four memory accesses, 1 wait state is generated, which
schedules execution as follows:

T1 T2 T3 T4   T1 T2 T3 T4   T1 T2 T3 T4   T1 T2 T3 T3 T4
  MEM OP        MEM OP        MEM OP        MEM OP (1 WAIT due Prog.)

If the memory is not ready at T3 then wait states are injected.

3)  A  request  for  access  to  Mem  A  P3  is  sent  to  the  address  translator. 
The  translator  determines  which  bus  the  data  is  on  and  queues  a 
request  for  Mem  A  P3  to  the  proper  priority  resolver. 

The priority resolver sends a request for a memory operation to Mem A P3. When
data is ready for CPU #X, a ready flag is set and the CPU model continues by fetching the
next instruction from the source specified (Local Memory) and repeats the next
instruction (decrements the repeat counter by 1 and continues). Other categories of instruction
can be defined in a similar manner.
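
The example statement can be decoded mechanically into the fields listed earlier. The sketch below is one reading of the example, with assumptions stated plainly: the field order is taken from the worked statement, a machine cycle is assumed to be four clocks (T1 to T4), and the wait-state count is taken to be the clocks left over after one machine cycle per memory access. `ModelInstruction`, `parse` and `wait_states` are hypothetical names.

```python
from dataclasses import dataclass

CLOCKS_PER_MACHINE_CYCLE = 4  # T1..T4, as in the timing example (assumed fixed)


@dataclass
class ModelInstruction:
    op: str             # e.g. "MD" (move data)
    memory: str         # e.g. "Mem A"
    port: int           # port number
    accesses: int       # memory fetches/stores per instruction
    clocks: int         # total clock cycles (dwell time)
    p_indexed: float    # probability the instruction is indexed
    repeats: int        # number of repeats
    next_source: str    # source of the next instruction, e.g. "L" (local)


def parse(text: str) -> ModelInstruction:
    """Split a comma-separated model statement into its eight fields."""
    op, mem, port, acc, clk, prob, rep, src = [f.strip() for f in text.split(",")]
    return ModelInstruction(op, mem, int(port.lstrip("P")), int(acc), int(clk),
                            float(prob), int(rep), src)


def wait_states(instr: ModelInstruction) -> int:
    """Clocks left over after one four-clock machine cycle per memory access."""
    return instr.clocks - instr.accesses * CLOCKS_PER_MACHINE_CYCLE
```

For the statement `MD, Mem A, P3, 4, 17, .3, 2, L` this reading gives four machine cycles and one wait state, which agrees with the schedule shown in the example.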

7.2  Address  Translation  Model 

This  model  (Figure  7-3)  receives  requests  for  memory  and  determines  which  bus 
access  model  to  send  the  request  to. 
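
In software terms the translator is a table lookup followed by a queue insertion. The sketch below is illustrative only: the memory-to-bus assignment in `BUS_FOR_MEMORY` is a placeholder, since the assignment is not stated in the text, and `translate` is a hypothetical name.

```python
# Assumed assignment of memories to buses; these numbers are placeholders,
# not values taken from the report.
BUS_FOR_MEMORY = {"LOCAL": 0, "CACHE": 1, "MAIN A": 2, "MAIN B": 3, "MAIN C": 4}


def translate(request: dict, bus_queues: dict) -> int:
    """Queue a memory request on the bus access model that serves it."""
    bus = BUS_FOR_MEMORY[request["memory"]]
    bus_queues.setdefault(bus, []).append(request)
    return bus
```

A request such as `{"cpu": 1, "memory": "MAIN A", "rw": "R"}` is appended to the queue of the bus serving that memory, where the priority resolver for that bus will later pick it up.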

7.3  Priority  Resolution  (Bus  Access) 

This  device  is  in  reality  a  priority  resolver  (Figure  7-4).  Its  inputs  are  requests  for 
bus  service  and  its  output  is  a  request  to  the  bus  model  for  a  transaction. 

The  form  of  the  request  is: 

Device  #X  requests  service  of  bus  #. 

The priority scheme is variable. It may be fixed; it may be head to tail, i.e., the next
priority is dependent on the previous device grant; it may be a fixed cycle; or the cycle
may rotate with time independent of access. Other methods may also be used. The point
here is to identify the bus access model as an entity that may be attached to, or detached
from, any bus subsystem.
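
Two of these schemes are easy to state as functions, which also shows why the resolver can be treated as a replaceable entity. The function names and the device numbering are assumptions made for illustration.

```python
def rotating_grant(requests, last_granted, n_devices):
    """Grant the first requesting device after the one served last.

    This is the scheme in which the next priority depends on the previous
    grant; returns None when no device is requesting.
    """
    for offset in range(1, n_devices + 1):
        device = (last_granted + offset) % n_devices
        if device in requests:
            return device
    return None


def fixed_grant(requests):
    """Fixed priority: the lowest-numbered requesting device always wins."""
    return min(requests) if requests else None
```

With devices 0 and 2 both requesting, the rotating scheme alternates between them from one grant to the next, whereas the fixed scheme always serves device 0 first.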

(Figure 7-3 diagram labels, where legible:)

INPUT TABLE: 1) Requestor CPU X; 2) Memory request; 3) Functional entry;
4) Displacement; 5) Result deposit; 6) R/W memory

MEMORY TRANSLATION TABLE: memory request (Local, Cache, Main A, Main B, Main C,
Global I/O, Link) versus bus assigned (0 through 4 shown)

BUS TABLE REQUEST: location and priority entries. These tables are for the priority
encoder inputs.

PRIORITY TABLE ORDER: CPU #1, 5 entries (CPU 1 requesting, R/W, functional entry,
displacement, result deposit); CPU #2, 5 entries; and so on for as many CPUs as are
connected to this priority network.

SYSTEM PRIORITY TABLE: begin addresses for priority units A through E. This table
gives the address translator models the locations where the tables begin.


Figure  7-3  -  Memory  Address  Translator  Model  Flow  Diagram 
Figure 7-4 - Priority Resolution Model Flow Diagram
7.4 Microbus Model (Bus Model)

Given that there are a number of buses in the system, a bus model is required. All bus
models need not be the same. They are characterized by the number of devices connected
to them and their cycle time, i.e., time to bus and time from bus. (Program Linkage
Model) Table driven, interrupt driven (I/O Model).

This  is  a  relatively  simple  model  as  shown  in  Figure  7-5. 

7.5  Memory  Model 

Each memory model has an access control determined by the number of users connected
to it (Figure 7-6). Part of this access control is a priority network which can be
identified as a priority model. The one shown is a dynamic rotating priority. Other priority
schemes may be used, such as fixed linear select.

The priority module has a cycle time associated with it. The output of the priority
network results in a request for a certain memory bank operation. In the case of local
memory this model will be a demand type, i.e., since the CPU is the only device using the
memory, no priorities are involved. In the case of cache, 4 devices may request service. In
the case of Memory Bank 4, 9 devices may access this one. In general, the priority model
should be a separate, detached entity independent of any system configuration. The idea
is to be able to attach different priority models to different memories.
Figure 7-6 - Memory Model Flow Diagram

APPENDIX A

SUPER-FEDERATED  MICROCOMPUTER  SYSTEM  (SFMCS) 
TIMING  AND  THROUGHPUT  ANALYSIS 


A1  Introduction 


In order to determine the throughput capability of the SFMCS architecture, a
representative state-of-the-art commercial microprocessor was selected and its performance
evaluated as a single processor using a realistic avionics instruction mix. The performance
of the SFMCS quad multiprocessor was then determined using the above microprocessor
and instruction mix, applied to four quad configurations, viz:

1) Basic shared-memory multiprocessor (Figure 4-1);

2) Shared-memory multiprocessor using the time-phased ring technique;

3) Dedicated microprocessor program memories and shared data memory without
   time phasing;

4) Dedicated microprocessor program memories and shared data memory with
   time phasing.

In  each  of  the  above  cases,  the  performance  improvement  of  the  quad  versus  the 
single  processor  was  noted. 

Lastly,  the  performance  of  Configuration  4  was  determined  using  different  avionic 
instruction  mixes  in  each  microprocessor. 
A2 Calculation of Intel 8086 Throughput

There are several established avionic instruction mixes which are representative of
various guidance and control-type algorithms. These mixes are shown in Table A-1.


TABLE A-1

CANDIDATE INSTRUCTION MIXES (%)

                   STANDARD    F4 FIRE    F15 AUTO         R.F4
                   AIRBORNE    CONTROL    FLIGHT CONTROL   INERTIAL NAV

MOVE                  45          22          41               45
ADD/SUB                9          17          19                9
MULTIPLY               5          17           4               <1
DIVIDE                 .2          4           -               <1
SHIFT                  5           2           3                8
LOGICAL                5           4          10               13
TEST & BRANCH         30          32          21               24
I/O CONTROL            1           2           2                -

The  Intel  8086  has  over  100  basic  instructions.  Within  the  basic  instructions,  a 
variety  of  options  may  be  used  to  perform  the  same  basic  operation.  In  addition,  several 
different  addressing  modes  may  be  used  in  the  instruction. 

The  execution  time  of  an  instruction  is  therefore  a  sum  of  the  contributions  due  to 
basic  type,  option  selected,  and  address  mode.  A  method  of  weighted  averages  will  be 
used  that  considers  not  only  the  above  mentioned  factors  but  also  includes  a  usage  weight 
as  well.  An  analysis  of  the  move  instruction  will  be  used  to  illustrate  the  technique. 
There are several types of move instruction, which are as follows:

Mnemonic: MOV

Description: MOV performs a byte or word transfer from a specified source
to a specified destination.

Encoding:

Memory or Register to/from Memory or Register:

    100010dw  mod reg r/m

    if d = 1 then SRC = EA, DEST = REG
    else SRC = REG, DEST = EA

    Timing (clocks)              Clocks     Percent Usage
    register to register         2          10
    memory to register           8+EA       5
    register to memory           9+EA       5

Immediate Operand to Memory or Register:

    1100011w  mod 000 r/m  data  data if w=1

    SRC = data, DEST = EA

    Timing (clocks)              Clocks     Percent Usage
    immediate to register        4          10
    immediate to memory          10+EA      5

Immediate Operand to Register:

    1011 w reg  data  data if w=1

    SRC = data, DEST = REG

    Timing (clocks)              Clocks     Percent Usage
    immediate to register        4          15
Memory Operand to Accumulator:

    1010000w  addr-low  addr-high

    if w = 0 then SRC = addr, DEST = AL
    else SRC = addr+1:addr, DEST = AX

    Timing: 10 clocks            Percent Usage: 30

Accumulator Operand to Memory:

    1010001w  addr-low  addr-high

    if w = 0 then SRC = AL, DEST = addr
    else SRC = AX, DEST = addr+1:addr

    Timing: 10 clocks            Percent Usage: 25

Memory or Register Operand to Segment Register:

    10001110  mod 0 reg r/m

    if reg ≠ 01 then SRC = EA, DEST = REG
    else undefined operation

    Timing (clocks):  register to register    2
                      memory to register      8+EA

Segment Register Operand to Memory or Register:

    10001100  mod 0 reg r/m

    SRC = REG, DEST = EA

    Timing (clocks):  memory to register      9+EA
                      register to register    2

Operation:

    (DEST) <== (SRC)

Flags Affected:

    None

A number of addressing modes are available that are used to calculate the effective
address (EA). These times are as follows:

Effective Address Timing

    Addressing Mode                                   Clocks       Percent Used

    No EA calc required                               0            25
    Direct 16-bit offset address                      6            50
    Indirect through base or index register
      (BX, BP, SI, DI)                                5            10
    Indirect through base or index register
      with displacement constant                      9            8
    Indirect through sum of index register
      plus base register                              7 or 8       5
    Indirect through sum of base register
      plus index register with displacement
      constant                                        11 or 12     2
Now the weighted average for EA is calculated as:

    Address Mode    Clocks    Percent Used    Per 100 Instr.    Total Clocks

    1               0         25              25                0
    2               6         50              50                300
    3               5         10              10                50
    4               9         8               8                 72
    5               7.5       5               5                 37.5
    6               11.5      2               2                 23
                                                      Total     482.5

Time for EA calc = (482.5 clocks x 200 nsec/clock) / 100 EA calculations

EA time = 0.965 µsec or 4.825 clocks
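
The weighted-average technique used for the EA time can be sketched in a few lines of Python. The clock counts and usage percentages are those tabulated above; the names `EA_MODES` and `weighted_clocks` are illustrative and not part of the report.

```python
# (clocks, percent used) for the six addressing modes listed in the report.
# Ranges such as "7 or 8 clocks" are taken at their midpoint, as the report does.
EA_MODES = [
    (0.0, 25),    # no EA calculation required
    (6.0, 50),    # direct 16-bit offset
    (5.0, 10),    # indirect through base or index register
    (9.0, 8),     # base or index register with displacement
    (7.5, 5),     # sum of index and base registers
    (11.5, 2),    # base plus index with displacement
]
CLOCK_NSEC = 200.0  # clock period used throughout the appendix


def weighted_clocks(entries):
    """Usage-weighted mean of (clocks, percent) pairs."""
    total_percent = sum(p for _, p in entries)
    return sum(c * p for c, p in entries) / total_percent


ea_clocks = weighted_clocks(EA_MODES)
ea_usec = ea_clocks * CLOCK_NSEC / 1000.0
print(f"EA: {ea_clocks:.3f} clocks, {ea_usec:.3f} usec")
```

The same helper applies unchanged to the per-instruction tables that follow, once each operand form is written as a (clocks, percent) pair.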

With the average EA time available (taken as 4.82 clocks in the table), the weighted
average of the MOV instruction is determined.

Here we have:

    Instruction       Operation    Clocks     Percent Used    Total Clocks

    Instruction #1    r-r          2          10              20
                      m-r          8+4.82     5               64.1
                      r-m          9+4.82     5               69.1
    Instruction #2    I-r          4          10              40
                      I-m          5+4.82     5               49.1
    Instruction #3                 4          15              60
    Instruction #4                 10         20              200
    Instruction #5                 10         20              200
    Instruction #6    r-r          2          2               4
                      m-r          8+4.82     3               38
    Instruction #7    m-r          9+4.82     2               27
                      r-r          2          3               6
                                                    Total     778.34

Move time = (778.34 x 200 nsec) / 100 = 1.556 µsec or 7.78 clocks


A2.2  Add/Subtract  Instructions 

The six types of these instructions are:

1) Memory or Register Operand with Register Operand

    Operation    Clocks      Percent Used    Total Clocks

    r-r          3           10              30
    m-r          9+4.82      5               69.1
    r-m          16+4.82     5               104.1

2) Immediate Operand to Memory or Register Operand

    Operation    Clocks      Percent Used    Total Clocks

    I-m          17+4.82     10              218.2
    I-r          4           20              80
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Operation 


Clocks 


Percent  Used 


3)  Immediate  Operand  to  Accumulator  Operand 


4)  Add  with  Carry 


9+4.82 

16+4.82 


5)  Immediate  Operand  to  Memory  or  Register  Operand 


6)  Increment 


17+4.82 


Total  Clocks 


47.41 

104.1 


109.1 


15+4.82 


986.11 


Add/Subtract time = (986.11 x 200 nsec) / 100

Add/Subtract time = 1.972 µsec or 9.86 clocks


A2.3  Multiply  Instruction 


Multiply (unsigned) 30 percent (usage)

    Operand            Clocks             Average Clocks    Percent    Total Clocks

    8 bit register     (70-77)            73.5              5          367.5
    8 bit memory       (76-83)+4.82       84.3              5          421.5
    16 bit register    (118-133)          125.5             10         1,255.0
    16 bit memory      (124-139)+4.82     136.3             10         1,363.0

Multiply (integer) 70 percent (usage)

    Operand            Clocks             Average Clocks    Percent    Total Clocks

    8 bit register     (80-98)            89                10         890
    8 bit memory       (86-104)+4.82      99.82             10         998.2
    16 bit register    (128-154)          141               25         3,525.0
    16 bit memory      (134-160)+4.82     151.82            25         3,795.5
                                                              Total    12,615.7

Multiply time = (12,615.7 x 200 nsec) / 100

Multiply time = 25.231 µsec or 126.15 clocks

A2.4  Divide  Instruction 

Divide (unsigned) 30 percent (usage)

    Operand            Clocks             Average Clocks    Percent Used    Total Clocks

    8 bit register     (80-90)            85                5               425
    8 bit memory       (86-96)+4.82       95.8              5               479
    16 bit register    (144-162)          153               10              1,530
    16 bit memory      (150-168)+4.82     163.82            10              1,638.2

Divide (integer) 70 percent (usage)

    Operand            Clocks             Average Clocks    Percent Used    Total Clocks

    8 bit register     (101-112)          106.5             10              1,065.0
    8 bit memory       (107-118)+4.82     117.3             10              1,173.0
    16 bit register    (165-184)          174.5             25              4,362.5
    16 bit memory      (171-190)+4.82     185.3             25              4,632.5
                                                                   Total    15,305.2

Divide time = (15,305.2 x 200 nsec) / 100

Divide time = 30.61 µsec or 153.05 clocks

A2.5 Shift Instruction

Shift logical left (25 percent) usage

    Operand           Timing          Clocks     Percent Used    Total Clocks

    Single bit reg    2               2          5               10
    Single bit mem    15+EA           15+4.82    5               99.1
    Var bit reg       8+4/bit         24         10              240
    Var bit mem       20+EA+4/bit     65.64      5               328.2
                                                       Total     677.3
Shift logical right (25 percent) usage

    Operand           Timing          Clocks     Percent Used    Total Clocks

    Single bit reg    2               2          5               10
    Single bit mem    15+EA           15+4.82    5               99.1
    Var bit reg       8+4/bit         24         10              240
    Var bit mem       20+EA+4/bit     65.64      5               328.2
                                                       Total     677.3

Shift arithmetic (25 percent) usage

    Operand           Timing          Clocks     Percent Used    Total Clocks

    Single bit reg    2               2          5               10
    Single bit mem    15+EA           15+4.82    5               99.1
    Var bit reg       8+4/bit         24         10              240
    Var bit mem       20+EA+4/bit     65.64      5               328.2
                                                       Total     677.3

Rotate (25 percent) usage

    Operand           Timing          Clocks     Percent Used    Total Clocks

    Single bit reg    2               2          5               10
    Single bit mem    15+EA           15+4.82    5               99.1
    Var bit reg       8+4/bit         24         10              240
    Var bit mem       20+EA+4/bit     65.64      5               328.2
                                                       Total     677.3

677.3 x 4 = 2709.2

Shift time = (2709.2 x 200 nsec) / 100 = 5.418 µsec or 27.09 clocks


A2.6  Logical  Instructions 


Exclusive OR (25 percent) usage

    Operation    Clocks      Percent    Total Clocks

    r-r          3           15         45
    m-r          9+4.82      2          18.96
    r-m          16+4.82     2          41.64
    I-r          4           2          8
    I-m          17+4.82     2          43.64
    I-r          4           2          8

AND (50 percent) usage

    Operation    Clocks      Percent    Total Clocks

    r-r          3           25         75
    m-r          9+4.82      10         138.2
    r-m          16+4.82     5          104.1
    I-r          4           5          20
    I-m          17+4.82     5          109

OR (25 percent) usage

    Operation    Clocks      Percent    Total Clocks

    r-r          3           10         30
    m-r          9+4.82      3          41.46
    r-m          16+4.82     3          62.46
    I-r          4           3          12
    I-m          17+4.82     3          65.46
    I-r          4           3          12
                               Total    835.04

Logical time = (835.04 x 200 nsec) / 100

Logical time = 1.670 µsec or 8.35 clocks
A2.7  Test  and  Branch  Instructions 


    Instruction            Case            Clocks    Percent    Total Clocks

    Jump on Less Than      Jump taken      8         25         200
                           Not taken       4         25         100
    JMP                    Int. segment    7         10         70
                           Int. segment    7         5          35
                           Int. segment    3         20         60
                           mem             7+EA      5          59.1
                           Int. segment    16+EA     10         208.2
                                                       Total    732.3

Test and branch time = (732.3 x 200 nsec) / 100

Test and branch time = 1.464 µsec or 7.32 clocks
A2.8 I/O Control Instructions

    Instruction    Usage          Clocks    Percent    Total Clocks

    Interrupt      25 percent
      Type 3                      51        12.5       637.5
      Not 3                       50        12.5       625.0
    INTO           7% pass        52        7          364
                   8% fail        4         8          32
    IRET           25%            24        25         600
    CLC            20%            2         20         40
    STC            15%            2         15         30
                                              Total    2,328.5

I/O control time = (2,328.5 x 200 nsec) / 100

I/O control time = 4.657 µsec or 23.28 clocks

A2.9  Average  Instruction  Execution  Times 

In  summary,  then  the  average  execution  time  for  each  Intel  8086  instruction  based 
on  a  200  nsec  clock  is  as  follows: 
    Instruction        Execution Time (µsec)    Clocks

    Move               1.556                    7.78
    Add/Sub            1.972                    9.86
    Multiply           25.23                    126.15
    Divide             30.61                    153.05
    Shift              5.418                    27.09
    Logical            1.670                    8.35
    Test and Branch    1.464                    7.32
    I/O Control        4.657                    23.28

A2.10  Intel  8086  Throughput 

The throughput of the 8086 may now be determined for the mixes cited in Table A-1.
The results are tabulated in Table A-2. It is interesting to note how strongly multiply/divide
operations affect throughput. In the F-4 fire control case, a 21 percent multiply/divide
load reduces the throughput by roughly a factor of four relative to the inertial NAV mix,
where the load is only 0.5 percent.
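
As a rough check on how the mix percentages and the average execution times combine, the following Python sketch computes the time for 100 instructions and the resulting throughput. The dictionary names are illustrative; the shift time is taken as the 5.418 µsec derived in A2.5, and the inertial NAV mix is omitted because its multiply and divide entries are given only as "<1".

```python
# Average execution times in microseconds (200 nsec clock), from A2.1 to A2.8.
EXEC_USEC = {
    "move": 1.556, "add_sub": 1.972, "multiply": 25.23, "divide": 30.61,
    "shift": 5.418, "logical": 1.670, "test_branch": 1.464, "io_control": 4.657,
}

# Instruction mixes in percent, from Table A-1.
MIXES = {
    "standard_airborne": {"move": 45, "add_sub": 9, "multiply": 5, "divide": 0.2,
                          "shift": 5, "logical": 5, "test_branch": 30, "io_control": 1},
    "f4_fire_control": {"move": 22, "add_sub": 17, "multiply": 17, "divide": 4,
                        "shift": 2, "logical": 4, "test_branch": 32, "io_control": 2},
    "f15_auto_flight": {"move": 41, "add_sub": 19, "multiply": 4, "divide": 0,
                        "shift": 3, "logical": 10, "test_branch": 21, "io_control": 2},
}


def time_per_100_usec(mix):
    """Time to execute 100 instructions, treating each percent as one instruction."""
    return sum(pct * EXEC_USEC[name] for name, pct in mix.items())


def kops(mix):
    """Throughput in thousands of operations per second."""
    return 100.0 / time_per_100_usec(mix) * 1000.0


for name, mix in MIXES.items():
    print(f"{name}: {time_per_100_usec(mix):.1f} usec per 100 instr, {kops(mix):.0f} KOPS")
```

For the standard airborne mix this gives a little over 304 µsec per 100 instructions, in line with the 304.5 µsec and 328 KOPS figures used in Section A3.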
A3 SFMCS Quad Throughput

The  throughput  for  several  architectural  cases  will  now  be  examined. 

The  cases  will  be  developed  to  measure  both  single  and  multiple  processor 
throughputs. 

A3.1  Case  1.  Four  Processors  Sharing  a  Common  Memory  (no  time  phasing) 

The  timing  and  arbitration  rules  for  this  case,  (Figure  A-l),  are  found 
in  Figure  A-2. 

A3.1.1  With  Memory  Access  Conflicts 

In order to determine CPU waiting time, a 25 clock sample will be used. Referring
to Figure A-2, µP1 encounters 14 wait states. The waiting time becomes:

Waiting time (µP1) = 14 wait clocks / 25 clocks = 56 percent

Similarly for µP2 we have

Waiting time (µP2) = 14/25 = 56 percent

Similarly for µP3 we have

Waiting time (µP3) = 14/25 = 56 percent
Figure A-1 - Case 1, Four Microprocessors Sharing a Common Memory
(Standard airborne mix applied to all processors; system throughput = 816 KOPS with
no time phasing, 1242 KOPS with time phasing.)

Figure A-2 - Case 1, Four Processors with Shared Memory, No Time Phasing
and for µP4

Waiting time (µP4) = 19/25 = 76 percent

Applying the standard airborne mix to all processors we get:

    µP1 = 304.5 µsec + 304.5 (0.56) = 475 µsec = 210 KOPS
    µP2 = 304.5 µsec + 304.5 (0.56) = 475 µsec = 210 KOPS
    µP3 = 304.5 µsec + 304.5 (0.56) = 475 µsec = 210 KOPS
    µP4 = 304.5 µsec + 304.5 (0.76) = 535 µsec = 186 KOPS

    Quad Throughput                             = 816 KOPS
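
The per-processor arithmetic above can be reproduced with a short Python sketch. The function name `kops_with_wait` is an assumption for illustration; the 304.5 µsec base time and the wait fractions are the report's own figures. Small differences from the tabulated values come from the report truncating each result to whole KOPS.

```python
BASE_USEC = 304.5  # time for 100 standard-airborne-mix instructions on one processor


def kops_with_wait(base_usec, wait_fraction):
    """Throughput when execution time is stretched by the observed wait fraction."""
    return 100.0 / (base_usec * (1.0 + wait_fraction)) * 1000.0


# Wait fractions read from the 25-clock sample for Case 1 (no time phasing).
case1_waits = [14 / 25, 14 / 25, 14 / 25, 19 / 25]
per_cpu = [kops_with_wait(BASE_USEC, w) for w in case1_waits]
quad = sum(per_cpu)

print([round(k, 1) for k in per_cpu], round(quad, 1))
```

The same function can be applied to any other set of wait fractions, so comparing arbitration schemes reduces to changing the `case1_waits` list.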

A3.1.2  Without  Memory  Access  Conflicts 

If the system were to run without memory access conflicts, the maximum throughput
would be:

Quad Throughput = 328 KOPS x 4 = 1312 KOPS

The gain of this system over the single processor case is:

Gain over 1 processor = 816/328 = 2.49

or, expressed as the percent of throughput utilized,

816/1312 = 62 percent

A3.2  Case  2.  Four  Processors  Sharing  a  Common  Memory  with  Time  Phasing 

The timing diagram for this case, together with the arbitration rules, is
found in Figure A-3.

The waiting time for each processor becomes:

    Waiting time (µP1) = 0/25 = 0 percent
                 (µP2) = 1/25 = 4 percent
                 (µP3) = 2/25 = 8 percent
                 (µP4) = 3/25 = 12 percent

Again applying the standard airborne mix to each processor we get:

    µP1 = 304.5 µsec + 304.5 (0)    = 304.5 µsec = 328 KOPS
    µP2 = 304.5 µsec + 304.5 (0.04) = 316.6 µsec = 315 KOPS
    µP3 = 304.5 µsec + 304.5 (0.08) = 328.8 µsec = 304 KOPS
    µP4 = 304.5 µsec + 304.5 (0.12) = 341.0 µsec = 293 KOPS

    Quad Throughput                                = 1242 KOPS

Figure A-3 - Case 2, Four Processors with Shared Memory
using Time-Phased Ring Technique

The gain of this system over the single processor case becomes:

Gain over 1 processor = 1242/328 = 3.78

or

Percent of throughput utilized = 1242/1312 = 94 percent

The effect of time phasing can be seen by comparing the two cases. This increase
in throughput becomes:

Gain by using time phasing = 94 - 62 = 32 percent

A3.3  Case  3.  Four  Processors  with  Dedicated  Program  Memories  Sharing  a  Common 
Data  Memory  Without  Time  Phasing 

In this configuration (Figure A-4), the program is split so that the program resides in
a local memory, with data shared between processors held in common memory. The split for
memory access used is 75 percent to local memory for instructions and 25 percent to
common memory for shared data.

It is noted that the shared memory is utilized 100 percent of the time. Again
applying the standard airborne mix, the throughput for µP1 becomes:

Access time from local = 304 x 75 percent = 228 µsec

Access time from common = 304 x 25 percent = 76 µsec

The access time to common memory involves wait states, and since memory is 100
percent utilized the same percent wait time that was determined in Case 1 applies.

Access time to common = 76 µsec + 76 (55 percent) = 117.8 µsec

Figure A-4 - Case 3, Four Processors with Dedicated Program Memories
and Shared Data Memory Without Time Phasing
The total time for the 100 standard airborne mix instructions becomes:

Total time = 228 + 117.8 = 345.8 µsec

    µP1 throughput = 1 / 3.458 µsec                        = 289 KOPS
    µP2                                                    = 289 KOPS
    µP3                                                    = 289 KOPS
    µP4 = 304 (0.75) + 76 + 76 (0.76) = 361.8 µsec         = 276 KOPS

    Quad Throughput: 1143 KOPS

The gain over the single processor case is

Gain over 1 processor = 1143/328 = 3.48
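
A minimal Python sketch of the Case 3 calculation follows. It assumes, as the text does, that only the shared-data portion of the access time is stretched by wait states; the function name `split_memory_kops` and its parameters are illustrative and not taken from the report.

```python
def split_memory_kops(base_usec, wait_fraction, local_share=0.75):
    """Throughput with program in local memory and data in shared memory.

    Only the shared portion of the access time is stretched by wait states.
    """
    local = base_usec * local_share
    shared = base_usec * (1.0 - local_share)
    total = local + shared * (1.0 + wait_fraction)
    return 100.0 / total * 1000.0


BASE_USEC = 304.0                    # standard airborne mix, 100 instructions
waits = [0.55, 0.55, 0.55, 0.76]     # wait fractions used in the text for Case 3
per_cpu = [split_memory_kops(BASE_USEC, w) for w in waits]
quad = sum(per_cpu)
print([round(k, 1) for k in per_cpu], round(quad, 1))
```

Varying `local_share` in this sketch gives a quick feel for how sensitive the quad throughput is to the assumed 75/25 split between instruction and data accesses.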


A3.4  Case  4,  Four  Processors  with  Dedicated  Program  Memories  and  Shared  Data 
Memory  (with  Time  Phasing) 


Using the same techniques as in Case 3, the throughput becomes:

    µP1 = 328 KOPS
    µP2 = 326 KOPS
    µP3 = 323 KOPS
    µP4 = 320 KOPS

    Quad Throughput: 1297 KOPS

Gain over 1 processor = 1297/328 = 3.955
A3.5 Case 5. Four Processors with Dedicated Program Memories, Different Instruction
Mixes, Shared Data Memory, Without Time Phasing

In order to determine the sensitivity of system throughput to the algorithms, a different
instruction mix is applied to each processor:

    µP1 = Standard Airborne Mix    304 µsec
    µP2 = F-15 Auto Flight         275 µsec
    µP3 = R-F4 Inertial NAV        204 µsec
    µP4 = F-4 Fire Control         692 µsec

    µP1 access for instruction = 304 x 75 percent = 228 µsec
    µP1 access for data        = 304 x 25 percent = 76 µsec

The common memory wait time is:

    76 + 76 (55 percent) = 117.8 µsec

    µP1 total time = 228 + 117.8 = 345.8 µsec = 289 KOPS

Similarly,

    µP2 = 319 KOPS
    µP3 = 430 KOPS
    µP4 = 122 KOPS

    Quad Throughput: 1160 KOPS

The average throughput for the single processor case is 331.25 KOPS (Table A-2).
The gain over the single processor case is:

Gain over 1 processor = 1160/331.25 = 3.5

In comparison to the single algorithm of Case 3, the above result shows that
executing different instruction mixes in each microprocessor does not have a significant
effect on the overall throughput of the quad.
APPENDIX B

REAL-TIME  MISSILE  SIMULATION  WITH  SFMCS 

B1  Introduction 

Given a high-speed modular processing system such as the SFMCS, it became
apparent that the potential existed to significantly reduce the present high cost of missile
simulations using large-scale, time-shared computing facilities, by providing instead a
low-cost dedicated system using standard industry microcomputers.

High throughput and software modularity could be addressed through
"super-federation", i.e., assigning one microcomputer per major airframe component or
functional element. The net result of this approach would therefore be aimed at rapidly
adapting any given airframe model to a new or improved version, as is typically the case
during the course of specific missile development, by the substitution of alternate
preprogrammed microcomputers. Further, the time-sharing and delays associated with a single
large-scale computing facility would be eliminated through the replication of small
dedicated microcomputer systems.

The  following  paragraphs  describe  the  analysis  of  missile  simulation  functions  and 
resulting  computer  performance  requirements  based  upon  the  expanded  SFMCS  as  shown 
in  Figure  B-l.  Figure  B-l  shows  functions  which  immediately  come  to  mind  in  terms  of 
the  missile  airframe  application.  The  following  paragraphs  outline  the  nature  of  missile 
airframe  functions  for  simulation  on  the  SFMCS. 

B1.1 Missile Aero Model

The functional aerodata has been linearized to stability coefficients which are a
function of Mach number. Since these terms vary relatively slowly, and they only
act as multipliers in the dynamic control loops (Subsection B1.6.5), they are computed at
the low frequency rate in the simulation model.

Figure B-1 - High-Performance Super-Federated Microcomputer System
For Missile Simulation

The aero derivatives are as follows:

Roll

$C_{l\delta}$ is the aerodynamic roll effectiveness coefficient, f(Mach)

$C_{lp}$ is the roll damping coefficient, f(Mach)

Pitch/Yaw

$C_{m\delta 0}$ is the lateral moment effectiveness coefficient of the control
surfaces at the aerodynamic reference point, f(Mach)

$C_{m\alpha 0}$ is the lateral moment partial with respect to inplane angle-of-attack
at the reference point, f(Mach)

$C_{mq}$ is the aerodynamic damping derivative, f(Mach)

$C_{N\alpha}$ is the aerodynamic force derivative with respect to angle-of-attack,
f(Mach)

$C_{N\delta}$ is the aerodynamic force derivative with respect to control surface
deflection, f(Mach)

Other aerodynamic parameters are computed in order to develop the forces and
moments acting on the missile. These terms depend on certain missile states: velocity,
altitude, and orientation.

Velocity ($V_M$) and altitude ($R_{M2}$) are obtained from the missile translational
motion model. Attitudes ($\phi_M$, $\theta_M$, $\psi_M$) are obtained by snapshotting the
high frequency states developed in the missile dynamics model.

The earth (inertial reference frame) to missile transformation matrix is then
computed.

$$[ME] = [\phi_M]\,[\theta_M]\,[\psi_M]$$

Then the velocity vector is transformed to the missile coordinate frame,

$$V_{MM} = [ME]\,V_M$$

from which the two components of angle-of-attack are generated.

$$\alpha_{yo} = \tan^{-1}(-V_{MM3}/V_{MM1})$$

$$\alpha_{zo} = \tan^{-1}(V_{MM2}/V_{MM1})$$

where the subscript o refers to the value of these quantities at the start of the low
frequency calculation cycle. Note that in the missile dynamics model these quantities
are updated between the low frequency calculations at the high frequency data rate.

Dynamic pressure and Mach number are also computed as part of this model. Air
density and velocity of sound are table look-ups,

$$\rho = f(R_{M2})$$

$$V_s = f(R_{M2})$$

and Mach number is developed as

$$\mathrm{MACH} = |V_M| / V_s$$

and dynamic pressure is given by

$$Q = \tfrac{1}{2}\,\rho\,|V_M|^2$$

Note that this model neglects the effect of wind.
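
The table look-up and the Mach and dynamic pressure calculations can be sketched as follows. The table contents are an assumption: they are approximate standard-atmosphere values at four altitudes chosen for illustration, not the tables used in the study, and the function names are hypothetical.

```python
import math
from bisect import bisect_right

# Hypothetical look-up tables (altitude in m); approximate standard-atmosphere values.
ALT_M = [0.0, 5000.0, 10000.0, 15000.0]
RHO_KG_M3 = [1.225, 0.7364, 0.4135, 0.1948]
V_SOUND_MS = [340.3, 320.5, 299.5, 295.1]


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped at the table ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def mach_and_q(vm, altitude):
    """Return (Mach number, dynamic pressure) for velocity vector vm; wind is neglected."""
    speed = math.sqrt(sum(c * c for c in vm))
    rho = interp(altitude, ALT_M, RHO_KG_M3)
    vs = interp(altitude, ALT_M, V_SOUND_MS)
    return speed / vs, 0.5 * rho * speed ** 2


mach, q = mach_and_q((600.0, 0.0, 50.0), 7500.0)
print(f"Mach {mach:.2f}, Q {q:.0f} Pa")
```

Because density and sound speed change slowly with altitude, evaluating this at the low frequency rate, as the text proposes, costs only two interpolations and one square root per cycle.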
B1.2  Missile  Physical  Properties 


The  characteristics  of  the  missile  that  are  categorized  as  missile  physical 
properties  are  the  thrust,  mass,  inertia,  etc.  This  model  generates  those  quantities  which, 
when  combined  with  the  aerodynamic  forces  and  moments,  produce  the  rotational  and 
translational  acceleration  of  the  missile. 

The basis of the model is the defined time history of the motor thrust, and the
relationship of this to the other physical properties of the missile. Thus, the thrust
profile is determined from a table of thrust level specified at arbitrary time points, with
linear interpolation for intermediate values. That is,

$$\mathrm{THRUST} = f(t)$$

There is a single state variable associated with this model, which is the total
impulse, or energy expended while thrusting. It is obtained via the integration of the
rate of energy expenditure, thrust.

$$\mathrm{IMP} = \int_0^t \mathrm{THRUST}\; dt$$

The other physical properties are directly proportional to energy (fuel) expended,
so that they may be derived from impulse.

$$\mathrm{MASS} = \mathrm{MASS}_{IC} + (\partial \mathrm{MASS}/\partial \mathrm{IMP})\,\mathrm{IMP}$$

$$I_{xx} = I_{xx,IC} + (\partial I_{xx}/\partial \mathrm{IMP})\,\mathrm{IMP}$$

$$I_{yy} = I_{yy,IC} + (\partial I_{yy}/\partial \mathrm{IMP})\,\mathrm{IMP}$$

$$X_{CG} = X_{CG,IC} + (\partial X_{CG}/\partial \mathrm{IMP})\,\mathrm{IMP}$$

These are missile mass (MASS), roll moment of inertia ($I_{xx}$), pitch-yaw moment of
inertia ($I_{yy}$), and center of gravity displacement ($X_{CG}$) from the longitudinal
reference point (at the missile nose).

The  partial  derivatives  are  assumed  constant  over  the  entire  range  of  thrust,  with 
an  average  value  used  to  insure  the  proper  parameters  in  the  missile  glide  condition. 
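
A minimal sketch of the thrust table, the impulse integration, and a property derived linearly from impulse is given below. The thrust profile, the initial mass, and the slope are invented for illustration; trapezoidal integration is one reasonable choice and is not stated in the report.

```python
from bisect import bisect_right

# Hypothetical thrust profile: (time in sec, thrust in N) at arbitrary time points.
THRUST_TABLE = [(0.0, 0.0), (0.1, 20000.0), (3.0, 20000.0), (3.2, 0.0)]


def thrust(t):
    """Thrust at time t by linear interpolation; constant outside the table."""
    times = [p[0] for p in THRUST_TABLE]
    if t <= times[0]:
        return THRUST_TABLE[0][1]
    if t >= times[-1]:
        return THRUST_TABLE[-1][1]
    i = bisect_right(times, t) - 1
    (t0, f0), (t1, f1) = THRUST_TABLE[i], THRUST_TABLE[i + 1]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


def integrate_impulse(t_end, dt):
    """Total impulse from 0 to t_end by trapezoidal integration of thrust."""
    steps = round(t_end / dt)
    imp = 0.0
    for k in range(steps):
        imp += 0.5 * (thrust(k * dt) + thrust((k + 1) * dt)) * dt
    return imp


def linear_property(initial, slope, imp):
    """Property = initial value + (constant partial derivative) x impulse."""
    return initial + slope * imp


imp = integrate_impulse(3.2, 0.1)
mass = linear_property(200.0, -1.0 / 2400.0, imp)  # assumed 200 kg launch mass
print(f"impulse {imp:.0f} N-s, mass {mass:.1f} kg")
```

The same `linear_property` call serves for the inertias and the center of gravity, each with its own initial value and slope.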

Other  properties  of  the  missile  necessary  to  scale  the  aerodynamic  quantities  to 
force  and  moment  are  as  follows: 

XQ  =  Reference  point  at  which  moment  data  is  taken 
S  =  Aerodynamic  reference  area 

C  =  Reference  dimension,  pitch  and  yaw 
b  =  Reference  dimension,  roll 

B1.3  Missile  Translational  Motion 

Missile  translational  motion  is  described  by  integration  of  Newton's  equations  in  an 
inertial  reference  frame.  The  acceleration  vector  developed  from  aerodynamic  force, 
missile  thrust,  and  other  forces  applied  to  the  missile  (such  as  launcher  constraints)  is 
converted  to  an  Earth  fixed  reference  frame  (E-frame)  and  combined  with  gravity  to 
produce  the  three-component  acceleration  vector  which  is  successively  integrated  to 
velocity  and  position. 

The acceleration in the missile fixed axes is given by

$$N_M = (N_{M1},\, N_{M2},\, N_{M3})^T$$

The conversion to the E-frame is achieved through the transformation matrix
describing the relative orientation of the missile coordinates and the inertial reference.
When gravity is added, the total inertial acceleration results.

$$A_M = G + [ME]^T N_M$$

This is then integrated to obtain the velocity and position states

$$V_M = V_{M,IC} + \int A_M \, dt$$

$$R_M = R_{M,IC} + \int V_M \, dt$$

These states are typically assigned to the low frequency regime of state variables
in the continuous subsystem. Because these equations have been decoupled from the
rotational modes of the missile (via $[ME]^T$), there is no significant component of the higher
frequency motion present in these equations. The missile linear acceleration due to the
aerodynamic and internal forces ($N_M$) is generated from the aerodynamic parameters and
the physical properties of the missile.

$$N_{M1} = (\mathrm{THRUST} + qS(C_{D0} + C_{XTHR}))/\mathrm{MASS}$$

where

$C_{XTHR}$ = A drag correction coefficient to account for drag increase when the motor
burns out (switched to zero prior to burnout)

$C_{D0}$ = The base drag coefficient, f(Mach)

$$N_{M2} = \frac{qS}{\mathrm{MASS}}\,(C_{N\alpha}\,\alpha_P + C_{N\delta}\,\delta_P)$$

$$N_{M3} = \frac{qS}{\mathrm{MASS}}\,(C_{N\alpha}\,\alpha_Y + C_{N\delta}\,\delta_Y)$$

B1.4  Relative  Geometry  Model 

The  relative  geometry  model  combines  the  missile  and  target  states  to  produce 
several  quantities  of  prime  importance  in  the  simulation.  The  driving  terms  to  the  guid¬ 
ance  of  the  missile  are  the  relative  position  and  velocity  of  the  target  with  respect  to  the 
missile. 
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The  relative  range  is  merely  the  difference  between  the  target  and  missile  position 
vectors, 

RTM  =  RT  -  RM 

while the relative velocity is given by,

VTM = VT - VM

The line-of-sight to the target (as seen from the missile) is obtained by unitizing
the relative range vector,

LOS = u (RTM)

The closing velocity is the relative velocity along the line-of-sight, that is

VC = -(VTM · LOS)

and projected time-to-go is given by,

tgo = VC · |RTM| / |VTM|^2
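
As an illustration, the relative geometry relations can be collected into one routine. This is
a sketch only: the function name and the list-based vector representation are assumptions, and
it presumes a nonzero range and nonzero relative velocity.

```python
import math


def relative_geometry(rt, vt, rm, vm):
    """Return (RTM, VTM, LOS, VC, tgo) from target and missile states."""
    rtm = [a - b for a, b in zip(rt, rm)]        # RTM = RT - RM
    vtm = [a - b for a, b in zip(vt, vm)]        # VTM = VT - VM
    rng = math.sqrt(sum(x * x for x in rtm))     # |RTM|
    los = [x / rng for x in rtm]                 # LOS = u(RTM)
    vc = -sum(v * u for v, u in zip(vtm, los))   # VC = -(VTM . LOS)
    tgo = vc * rng / sum(v * v for v in vtm)     # tgo = VC |RTM| / |VTM|^2
    return rtm, vtm, los, vc, tgo
```

For a head-on case with the target 1000 m ahead and a 500 m/sec relative speed, the routine
gives a closing velocity of 500 m/sec and a time-to-go of 2 sec.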

B1.5  Miss  Distance  Calculation 

When a simulated flight is terminated because the time-to-go to the target has
become negative, a miss distance iteration is made. This calculation is based on the
missile and target states near intercept, and assumes that over the short iteration time
the missile and target accelerations are constant.

The range between the missile and target can be expressed as a quadratic in time-to-go,

RF = R + Ṙ tgo + R̈ tgo^2 / 2

where

R, Ṙ and R̈ are the relative range, velocity, and acceleration that exist between
the target and missile at the time the simulation run is terminated, but prior to the miss
distance iteration.

The relative velocity is given by,

V = Ṙ + R̈ tgo

The minimum approach to the target by the missile is defined by the condition,

RF · V = 0

This requires the solution of a cubic equation in tgo, which is solved by a Newton-Raphson
iteration method. The iteration is performed until the time-to-go has settled to within 1
μsec of the solution, or until 100 passes have been made through the iteration cycle. The
result is the amount of time the extrapolation covered, the magnitude of the miss distance
vector, and the three components of the miss distance vector.
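
The iteration can be sketched as follows, under stated assumptions: the closest-approach
condition is taken to be RF · V = 0 with RF = R + Ṙ tgo + R̈ tgo^2/2 and V = Ṙ + R̈ tgo, the
starting guess is tgo = 0, and the tolerance is applied to the size of the Newton step. The
function and argument names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def miss_distance(r, rdot, rddot, tol=1e-6, max_passes=100):
    """Newton-Raphson search for the time of closest approach.

    r, rdot, rddot: relative range, velocity, acceleration at run termination.
    Returns (tgo, miss magnitude, miss vector).
    """
    tgo = 0.0
    for _ in range(max_passes):
        rf = [p + v * tgo + 0.5 * a * tgo * tgo for p, v, a in zip(r, rdot, rddot)]
        vel = [v + a * tgo for v, a in zip(rdot, rddot)]
        f = _dot(rf, vel)                        # cubic in tgo, zero at closest approach
        dfdt = _dot(vel, vel) + _dot(rf, rddot)  # derivative of RF . V with respect to tgo
        if dfdt == 0.0:
            break                                # no Newton step can be taken
        step = f / dfdt
        tgo -= step
        if abs(step) < tol:                      # settled to within the tolerance
            break
    rf = [p + v * tgo + 0.5 * a * tgo * tgo for p, v, a in zip(r, rdot, rddot)]
    return tgo, math.sqrt(_dot(rf, rf)), rf
```

With zero relative acceleration the cubic degenerates to a linear equation and the first
Newton step lands on the solution; the second pass only confirms convergence.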

B1.6  High  Frequency  Models 


The high frequency regime contains the rotational motion of the missile and the
models of the missile subsystems: the control actuator section, the inertial instruments,
the seeker gimbal system, and the receiver.

B1.6.1 Control Actuator Section (CAS)

The control actuator is modeled as a first order transfer function with rate
and position limits. There are four such actuators, one for each control surface. The
block diagram for a single actuator is illustrated in Figure B-2. Table B-1 lists the inputs,
outputs, and parameters of the CAS model.
Figure B-2 - Control Actuator Section


TABLE B-1

CAS MODEL QUANTITIES

Type        Quantity      Nominal Value    Definition

Input       δci (1-4)     -                Control surface angle command
Output      δi (1-4)      -                Control surface angle
Parameter   τCAS          0.01 sec         Response time constant
Parameter   δ̇LM           5.236 rad/sec    Rate limit
Parameter   δLM           0.5236 rad       Angle limit
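
A minimal sketch of one actuator, using the nominal values of Table B-1. The exact arrangement
of the limiters in Figure B-2 is not reproduced here; this sketch assumes the rate limit acts on
the first order lag output and the angle limit acts on the integrated angle, and it uses a
rectangular integration step. All function names are hypothetical.

```python
def clamp(x, limit):
    """Limit x to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, x))


def cas_step(delta, delta_cmd, dt, tau=0.01, rate_lim=5.236, angle_lim=0.5236):
    """Advance one actuator by dt seconds and return the new surface angle (rad)."""
    rate = clamp((delta_cmd - delta) / tau, rate_lim)   # first order lag, rate limited
    return clamp(delta + rate * dt, angle_lim)          # position limited


# Four actuators, one per control surface, stepped at 0.5 msec for 0.5 sec
deltas = [0.0, 0.0, 0.0, 0.0]
commands = [1.0, -1.0, 0.1, 0.0]
for _ in range(1000):
    deltas = [cas_step(d, c, 0.0005) for d, c in zip(deltas, commands)]
```

The first two surfaces are commanded beyond the angle limit and saturate at ±0.5236 rad; the
third settles at its 0.1 rad command after an initial rate-limited interval.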
B1.6.2 Inertial Instrument Models


The  Level-1  model  of  the  instruments  represents  the  more  significant  errors 
present  in  the  measurement  process. 

These  errors  are  added  to  the  quantities  to  be  measured  as  determined  by 
the  kinematics.  The  errors  include  the  effects  of  gyro  drift  and  accelerometer  bias,  but 
not  the  dynamics  of  response  of  the  measurement  devices.  Also  included  are  the  output 
limits  representing  the  dynamic  range  of  the  devices. 

The gyro measurements are modeled by the following expressions:

WP = LIM (WM'; BGY1LM, BGY2LM, BGY3LM)

where

WM'1 = WM1 + DDGIN1B

WM'2 = WM2 + DDGIN2B

WM'3 = WM3 + DDGIN3B

and 


DDGIN1B, DDGIN2B, DDGIN3B are the zero-gee drift rates of the respective
gyros. These are nominally zero but may be used for error sensitivity studies.

Each  of  the  error  sources  is  determined  from  statistical  distributions  with 
specified  mean  and  variance.  For  Monte-Carlo  analysis  a  new  value  of  each  parameter  is 
chosen  for  each  flight  in  a  sample  set,  to  represent  the  errors  typical  of  random  missiles. 
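
The gyro model and the per-flight parameter draw might be sketched as below. The Gaussian
distribution, the numerical values, and the function names are assumptions made for
illustration; the report specifies only that each error source has a given mean and variance.

```python
import random


def gyro_output(wm, drift, limits):
    """Add the zero-gee drift to each body rate, then apply the output limit."""
    return [max(-lim, min(lim, w + d)) for w, d, lim in zip(wm, drift, limits)]


def draw_drifts(mean, sigma, rng):
    """Draw DDGIN1B, DDGIN2B, DDGIN3B once per flight (Gaussian assumed)."""
    return [rng.gauss(mean, sigma) for _ in range(3)]


rng = random.Random(1)        # seeded so that a sample set can be repeated
for flight in range(3):       # each flight represents one random missile
    drift = draw_drifts(0.0, 0.001, rng)
    wp = gyro_output([0.2, -8.0, 0.0], drift, [5.0, 5.0, 5.0])
```

Drawing the drifts once per flight, rather than once per step, keeps each error constant over
a trajectory while varying it across the Monte-Carlo sample set.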

The accelerometer measurements are modeled by the following expressions:

NP = LIM (NP'; BAC1LM, BAC2LM, BAC3LM)

where

NP'1 = NM1 + NP1o
NP'2 = NM2 + NP2o
NP'3 = NM3 + NP3o

B1.6.3 Head Control and Stabilization Model

The model of the missile seeker is defined to represent head control and
stabilization. It is configured to operate in the inertially stabilized mode, so as to
represent the operation of the missile seeker. The inertially stabilized mode uses gyros
mounted on the antenna to achieve the desired rate stabilization.

The head control and stabilization system is pictured in block diagram form
in Figure B-3. There are two control axes: an outer gimbal, which is the pitch axis, and an
inner gimbal, which is the yaw axis. Head control is achieved by rate commanding the
seeker in the inertially stabilized mode. Base motion, which is developed by the missile
rotation and, for the inner gimbal, by outer gimbal motion, is decoupled from the seeker
track loop by the inherent inertial stabilization of the electric motor drive,
supplemented by gyro feedback.

The stabilization loop has been simplified to a first order response, but
improved low frequency stabilization could be achieved with a more complex
representation of the stabilization loop dynamics. The amount of coupling through the
motor and gearing is defined by the parameters KGR2 and KGR3. If these are zero, no
base motion is coupled into the antenna drive; if they are 1, the base motion
is directly coupled.

B1.6.4 Receiver Measurement Model

The receiver measurement model produces the equivalent boresight errors
from the geometric tracking errors and the various noise sources present in the
measurement process.

The  geometric  error  is  derived  from  the  line-of-sight  to  the  apparent 
(glinting)  target  and  the  seeker  orientation  with  idealized  antenna  patterns  included  in  the 
model.  The  line-of-sight  vector  is  transformed  to  the  antenna  coordinates  by: 
u'A = (AE) u'E

where

(AE) = (AM) (ME)

The idealized antenna pattern model then produces the monopulse channel
signals as follows,

ΣV = 1.0

ΔPV = u'A2
ΔYV = -u'A3

and the boresight error (prior to adding other errors) is

ε2 = ΔYV / ΣV

ε3 = ΔPV / ΣV

The range and range rate (prior to adding errors) are taken directly from
the geometry. That is,

Ṙ = VC

RTM = |RTM|

The  measurement  noise  is  added  to  produce  angle  errors  and  the  doppler 
velocity  to  be  used  for  target  tracking  and  guidance. 

Receiver  thermal  noise  and  range  independent  noise  are  generated  by  adding 
band  limited  Gaussian  noise  to  the  quantities  that  have  been  derived  from  the  geometry. 
The  bandwidth  is  assumed  constant  as  it  is  representative  of  the  receiver  hardware.  The 
noise  variance  due  to  thermal  noise  is  dependent  upon  the  receiver  signal-to-noise  ratio, 
which  changes  with  the  effective  radar  cross  section  of  the  target  and  the  power  loss  due 
to  range.  The  signal-to-noise  ratio  (S/N)  is  computed  from, 
S/N = (R0^4 (S/N0) σT) / (RTM^4 σ0)

where

R0, S/N0, and σ0 = Normalizing parameters to represent the
                   illuminator power and missile receiver gain

σT  = The effective bistatic cross section of the
      target as determined by the fading target
      model

RTM = The magnitude of the range vector between
      missile and target


The noise variance of the signals (which is considered white noise) is given
by,

VRN = σ²RN / (1 + S/N) + σ²SN

where

σ²RN = ΦRN / ΔtH
σ²SN = ΦSN / ΔtH

and

ΦRN, ΦSN = The spectral densities of receiver and servo noise
           (range independent)

ΔtH = The integration step size for the high frequency
      states in the simulation
A Gaussian sample is drawn with a standard deviation equal to the square root
of the noise variance, VRN, and is then filtered with a time constant τN so as to
represent the noise bandwidth. The resultant noise is added to
the geometric error along with the other error sources modeled to produce the guidance
errors as indicated in Figure B-4.
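
The noise generation can be sketched as follows. The sketch reads the signal-to-noise
expression as S/N = R0^4 (S/N0) σT / (RTM^4 σ0) and the variance as
VRN = σ²RN / (1 + S/N) + σ²SN with σ² = Φ / ΔtH. The discrete first order filter and all
function names are assumptions, not the report's implementation.

```python
import math
import random


def signal_to_noise(rtm, sigma_t, r0, sn0, sigma0):
    """S/N = R0^4 (S/N0) sigma_T / (RTM^4 sigma_0)."""
    return (r0 ** 4) * sn0 * sigma_t / ((rtm ** 4) * sigma0)


def noise_variance(phi_rn, phi_sn, dt_h, sn):
    """VRN = sigma2_RN / (1 + S/N) + sigma2_SN, with sigma2 = PHI / dt_h."""
    return (phi_rn / dt_h) / (1.0 + sn) + phi_sn / dt_h


def filtered_noise(variance, dt_h, tau, state, rng):
    """One step of a first order low-pass filter driven by a Gaussian sample."""
    sample = rng.gauss(0.0, math.sqrt(variance))
    return state + (dt_h / tau) * (sample - state)
```

Because S/N varies as the inverse fourth power of range, halving the range raises S/N by a
factor of 16, and the thermal part of the variance falls accordingly as the missile closes.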

The  measurement  of  closing  rate  is  assumed  to  be  the  ideal  (geometric) 
closing  velocity,  VC. 

B1.6.5 Missile Dynamics Model

The aerodynamic forces and moments are developed by combining the
aerodynamic coefficients with the missile physical properties. However, in the case of the
moment generation, care must be taken to preserve the dynamic integrity of these
calculations. Since the rotational modes of the missile are of significantly wider
bandwidth than the translational motion, an effectively smaller computation cycle is required.
This is achieved, while maintaining computational efficiency, through hybridization of the
airframe model.


Figure B-4 - Receiver/Error Source Model
B1.6.6 Moment Generation

Moments  are  not  a  direct  output  of  this  model;  rather,  the  torque/inertia 
ratio,  which  is  the  inertial  acceleration  (rotational),  is  the  quantity  sent  to  the  rotational 
motion  model. 

These terms are developed in the manner indicated in Figure B-5. The
model is "hybrid" in nature, in that some terms are computed at the high data rate and
others at the lower data rate. The low data rate terms are typically multipliers to the
high data rate variables. The multipliers themselves are slowly varying quantities, so in
this role they do not contribute phase errors to the high bandwidth loops. The simulation
computational savings gained by this form of implementation is that the function generation
(aerodynamic coefficients, missile physical parameters) is performed at a relatively low
data rate (see missile kinematics model).

This provides adequate compensation to make the quantities αY and αP
effectively computed at the high data rate.
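
The hybrid arrangement can be illustrated with a two-rate loop. The 20 msec and 0.5 msec
periods are the low and high frequency periods quoted elsewhere in this appendix; the
coefficient function and the fast variable are hypothetical stand-ins, not the airframe model.

```python
FAST_DT = 0.0005                     # high data rate step, sec
SLOW_DT = 0.020                      # low data rate step, sec
RATIO = round(SLOW_DT / FAST_DT)     # fast steps per slow update


def slow_multiplier(mach):
    """Stand-in for function generation such as aerodynamic coefficient lookup."""
    return 1.0 + 0.1 * mach          # hypothetical, for illustration only


def run(fast_steps, mach=2.0):
    """Return (number of slow updates, last product) after fast_steps steps."""
    slow_updates = 0
    gain = 0.0
    product = 0.0
    for k in range(fast_steps):
        if k % RATIO == 0:
            gain = slow_multiplier(mach)   # refreshed once per 20 msec
            slow_updates += 1
        fast_variable = 0.01 * k           # stands in for a fast state
        product = gain * fast_variable     # evaluated every 0.5 msec
    return slow_updates, product
```

Only the multiplication runs at the high rate; the costly function generation is performed
once for every 40 fast steps.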
[Figure B-5: block diagram with SLOW CALCULATIONS (20 msec) and FAST CALCULATIONS (0.5 msec) panels]
Missile rotational motion is obtained by integration of a modified set of
Euler equations. The missile body-fixed axes system is chosen in the direction of the
principal axes of the missile, thus making the products of inertia vanish. There remains
the Euler form of the rotational equations of motion.

Ixx ẆM1 + (Izz - Iyy) WM2 WM3 = T1

Iyy ẆM2 + (Ixx - Izz) WM1 WM3 = T2

Izz ẆM3 + (Iyy - Ixx) WM1 WM2 = T3


where


T1, T2, and T3 = The components of torque applied to axes 1, 2, and 3

Ixx, Iyy, and Izz = The moments of inertia about axes 1, 2, and 3
WM1, WM2, and WM3 = The rotational rates about these axes

Since missiles are nearly symmetrical in pitch and yaw,

Iyy ≈ Izz

and since roll moment of inertia is typically only 1 or 2 percent of the lateral inertia,

Ixx << Iyy

so that the equations can be reasonably simplified to

ẆM1 = T1 / Ixx

ẆM2 = T2 / Iyy + WM1 WM3

ẆM3 = T3 / Iyy - WM1 WM2


These  terms  are  the  derivatives  of  the  body  rate  vector,  whose  components 
are  the  state  variables  of  this  model.  Since  these  states  are  in  the  high  bandwidth  control 
loop  of  the  autopilot,  it  is  necessary  to  perform  this  state  integration  at  a  sufficiently 
small  step  size  so  that  no  adverse  lag  is  imparted  by  the  simulation. 
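
A minimal sketch of the body rate integration, assuming the simplified equations take the form
dWM1/dt = T1/Ixx, dWM2/dt = T2/Iyy + WM1 WM3, dWM3/dt = T3/Iyy - WM1 WM2 (that is, Iyy = Izz and
Ixx negligible against Iyy). The rectangular integration rule and the function names are
assumptions.

```python
def body_rate_derivatives(w, torque, ixx, iyy):
    """Derivatives of the body rates from the simplified Euler equations."""
    w1, w2, w3 = w
    t1, t2, t3 = torque
    return (t1 / ixx,
            t2 / iyy + w1 * w3,
            t3 / iyy - w1 * w2)


def integrate_rates(w, torque, ixx, iyy, dt):
    """One rectangular (Euler) integration step of the body rate vector."""
    wdot = body_rate_derivatives(w, torque, ixx, iyy)
    return tuple(wi + di * dt for wi, di in zip(w, wdot))
```

A call such as `integrate_rates(w, torque, ixx, iyy, 0.0005)` would be made once per high
frequency step, so that the integration adds little lag to the autopilot loop.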

Missile  orientation  is  developed  by  integrating  a  set  of  Euler  rates  driven  by 
the  missile  body  rate  vector. 

These rates are integrated to produce the Euler angles θM, ψM, and φM
which define the orientation state (relative to the inertial reference) of the missile.
From these angles we compute the earth-to-missile transformation matrix.

B1.7  Approximate  Computer  Requirements 

The  computer  requirements  are  estimated  based  on  the  speed  of  execution  and  the 
accuracy  required  to  implement  the  control  functions  and  provide  an  accurate  simulation 
of  the  physical  system.  These  requirements  are  defined  for  each  of  the  four  function 
groups  described  earlier. 

B1.7.1 Guidance Functions

The computers implementing the guidance function must execute the
algorithms so that the autopilot commands and seeker tracking commands are generated in
NGT 10 msec. The filter prediction and inertial reference algorithms must be complete
prior to the beginning of the next guidance cycle (20 msec). The result is that the
computer can support a guidance cycle rate of 50 Hz and the guidance and tracking
commands suffer from no greater than a 10 msec computer time delay (transport lag).
The computer should have at least 16 bits of dynamic range with double
precision (or equivalent) to maintain the inertial states to the equivalent of 1 m/sec^2
acceleration resolution over a dynamic range of 20,000 m.

The  input/output  functions  should  be  supported  by  modules  of  NLT  10  bits. 

B1.7.2 Autopilot Functions

The computers implementing the autopilot function must execute the
algorithms so that the control actuator commands are generated in NGT 1 msec. The Euler
angle integration for the attitude reference must be complete prior to the beginning of
the next autopilot cycle (2 msec). The result is that the computer can support an
autopilot cycle rate of 500 Hz and the autopilot stability path suffers from no greater
than a 1 msec compute time delay (transport lag).

The  computer  should  have  at  least  16  bits  of  dynamic  range. 

The  input/output  functions  should  be  supported  by  modules  of  NLT  10  bits. 

B1.7.3 Low Frequency Physical Models

The  computers  implementing  the  low  frequency  physical  models  must 
execute  the  algorithms  in  a  period  NGT  20  msec. 

The  computers  should  have  a  word  size  of  NLT  16  bits,  with  double  precision 
(or  equivalent)  for  the  velocity  and  position  states  of  the  missile  and  target  in  order  to 
achieve  trajectory  accuracy  of  0.1  m  with  a  20,000  m  dynamic  range. 
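
The need for double precision follows from a short word-length calculation, sketched below.
The helper name is hypothetical, and the count covers magnitude bits only (no sign bit or
guard bits), which is an assumption.

```python
import math


def bits_needed(dynamic_range, resolution):
    """Smallest number of magnitude bits that resolves `resolution` over the range."""
    return math.ceil(math.log2(dynamic_range / resolution))


position_bits = bits_needed(20000.0, 0.1)   # 200,000 levels
```

Resolving 0.1 m over 20,000 m takes 200,000 levels, or 18 bits, which exceeds a single 16-bit
word.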

The  input/output  functions  should  be  supported  by  modules  of  NLT  12  bits. 

B1.7.4 High Frequency Physical Models

The  computers  implementing  the  high  frequency  physical  models  must 
execute  the  algorithms  in  a  period  NGT  0.5  msec. 

The  computer  should  have  a  word  size  of  NLT  16  bits. 

The  input/output  functions  should  be  supported  by  modules  of  NLT  12  bits. 
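
The timing requirements of the four function groups can be tabulated with a few lines of
arithmetic. The periods are those quoted in B1.7.1 through B1.7.4; treating 20 msec as a
common frame, and the names used, are assumptions made for illustration.

```python
PERIODS_MSEC = {
    "guidance": 20.0,
    "autopilot": 2.0,
    "low frequency models": 20.0,
    "high frequency models": 0.5,
}

FRAME_MSEC = 20.0   # longest period, taken as the common frame


def schedule(periods_msec, frame_msec):
    """Return {name: (rate in Hz, passes per frame)} for each function group."""
    return {name: (1000.0 / period, round(frame_msec / period))
            for name, period in periods_msec.items()}


table = schedule(PERIODS_MSEC, FRAME_MSEC)
```

On this basis the high frequency models run at 2000 Hz, 40 passes for every guidance cycle,
and the autopilot runs 10 passes per guidance cycle.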
B1.8 Conclusion

In conclusion, it appears quite feasible to perform real-time digital missile
simulations using the SFMCS. Further, the degree of configuration/flexibility and independence
from time-shared facilities are two unique advantages.
APPENDIX C

SFMCS  FUNCTIONAL  BLOCK  DIAGRAMS 


This  appendix  contains  functional  block  diagrams  of  the  basic  quad  and  expanded 
SFMCS  architecture,  which  employs  four  quads  and  one  host  processor.  These  designs 
formed  the  basis  of  the  detailed  timing  analyses  and  simulation  models. 



Figure C1 - SFMCS - Expanded System, Dual Quad Configuration, Block Diagram
Figure C3 - SFMCS - Expanded System, Detailed Block Diagram